Skip to content

Mixed case host in VirtualService causes RDS to go STALE. #49638

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 tasks done
andy-townsend opened this issue Feb 29, 2024 · 18 comments · Fixed by #49674
Closed
2 tasks done

Mixed case host in VirtualService causes RDS to go STALE. #49638

andy-townsend opened this issue Feb 29, 2024 · 18 comments · Fixed by #49674
Assignees

Comments

@andy-townsend
Copy link

andy-townsend commented Feb 29, 2024

Is this the right place to submit this?

  • This is not a security vulnerability or a crashing bug
  • This is not a question about how to use Istio

Bug Description

Last week we ran into an issue where a duplicate VirtualService entry was added to one of our clusters and it causes the RDS status to go STALE. I've been able to duplicate the issue and its down to where there's mixed case in the hosts entry. eg:

  hosts:
  - test-namespace-HELLO.int.lab-ie-01.mydomain.uk

and

  hosts:
  - test-namespace-hello.int.lab-ie-01.mydomain.uk

We get the following error in the logs

{"level":"warning","time":"2024-02-29T17:19:42.202215Z","scope":"envoy config","msg":"gRPC config for type.googleapis.com/envoy.config.route.v3.RouteConfiguration rejected: Only unique values for domains are permitted. Duplicate entry of domain test-namespace-hello.int.lab-ie-01.mydomain.uk in route http.29000","caller":"external/envoy/source/extensions/config_subscription/grpc/grpc_subscription_impl.cc:138","thread":15}

I can replicate the issue by applying this yaml to the cluster.

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: test-duplicate-vs-1
  namespace: test-namespace
spec:
  gateways:
  - ingress-private/private-cluster
  - ingress-private/private-env
  hosts:
  - test-namespace-HELLO.int.lab-ie-01.mydomain.uk
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: hello-world.test-namespace.svc.cluster.local
        port:
          number: 8080
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: test-duplicate-vs-2
  namespace: test-namespace
spec:
  gateways:
  - ingress-private/private-cluster
  - ingress-private/private-env
  hosts:
  - test-namespace-hello.int.lab-ie-01.mydomain.uk
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: hello-world.test-namespace.svc.cluster.local
        port:
          number: 8080

We found that this broke routing within the cluster for this service which started generating errors. I can see there's this issue where DNS names were allowed to be case-insensitive and also this issue that prevents duplicate routes (but doesn't account for case?).

Version

$ istioctl version 
client version: 1.20.3
control plane version: 1.20.3
data plane version: 1.20.3 (32 proxies)

$ kubectl version
Client Version: v1.28.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.5-eks-5e0fdde

Additional Information

No response

@howardjohn
Copy link
Member

cc @ramaraochavali

@hzxuzhonghu
Copy link
Member

We should ignore case in the domain

@hzxuzhonghu
Copy link
Member

@howardjohn
Copy link
Member

https://www.rfc-editor.org/rfc/rfc7230#section-3.2 all headers are case-insenstive

...case-insensitive field name..

Just the header name is, this is about the header value. But Host specifically is case insensitive so I agree we should ignore the case here

@ramaraochavali ramaraochavali self-assigned this Mar 2, 2024
@ramaraochavali
Copy link
Contributor

Can you please share gateways as well? I guess you have similar duplicate host name in gateways? We have fixed this for sidecar but not gateways.

@andy-townsend
Copy link
Author

For the gateway config we're using a wildcard like this;

---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  annotations:
    meta.helm.sh/release-name: private-cluster
    meta.helm.sh/release-namespace: ingress-private
  name: private-cluster
  namespace: ingress-private
spec:
  selector:
    app: istio-ingressgateway
  servers:
  - hosts:
    - int.lab-ie-01.mydomain.uk
    - '*.int.lab-ie-01.mydomain.uk'
    port:
      name: http
      number: 29000
      protocol: HTTP
    tls:
      httpsRedirect: true
  - hosts:
    - int.lab-ie-01.mydomain.uk
    - '*.int.lab-ie-01.mydomain.uk'
    port:
      name: https
      number: 29001
      protocol: HTTPS
    tls:
      cipherSuites:
      - ECDHE-RSA-CHACHA20-POLY1305
      - ECDHE-RSA-AES256-GCM-SHA384
      - ECDHE-RSA-AES256-SHA
      credentialName: private-cluster-tls
      maxProtocolVersion: TLSV1_3
      minProtocolVersion: TLSV1_2
      mode: SIMPLE

@andy-townsend
Copy link
Author

@ramaraochavali - I see this is closed as fixed by #49674 but do you know when that will be released?

@hzxuzhonghu
Copy link
Member

In 1.22,and need to backport

@jcetkov
Copy link

jcetkov commented Dec 12, 2024

looks this "fix" introduced a similar bug, where if the host is mixed case, but in sync in the VS and GW there was no issue prior to this "bugfix" but with it, the same configuration now fails, as the host in GW gets lowercased, but the one in VS doesn't. tested on istiod-1-21-5-asm-12

EDIT: the important part to reproduce it is to have a plain redirected to tls in the gateway

      tls:
        httpsRedirect: true

@ramaraochavali
Copy link
Contributor

Can you please share the exact config you used for me to reproduce?

@jcetkov
Copy link

jcetkov commented Dec 13, 2024

@ramaraochavali
I'm omitting the secret, but this should be a minimal config for gw/vs to repro.
Note: if I make the foo.bar.com lowercase in either the VS or the GW, the issue goes away.
if I don't configure the httpsRedirect the issue goes away.
This works on 1-19-10-asm-0 and breaks on 1-21-5-asm-12 (those are version I'm migrating between right now). I strongly suspect it's a regression related to a fix to this issue, where it's likely not lowercased in all the places it should be.

2024-12-13T22:27:23.005678Z	warning	envoy config external/envoy/source/extensions/config_subscription/grpc/grpc_subscription_impl.cc:138	gRPC config for type.googleapis.com/envoy.config.route.v3.RouteConfiguration rejected: Only unique values for domains are permitted. Duplicate entry of domain foo.bar.com in route http.18080	thread=20
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: gateway-foo
  namespace: foo
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - Foo.bar.com
    port:
      name: 443-foo
      number: 443
      protocol: HTTPS
    tls:
      credentialName: foo.bar.com
      minProtocolVersion: TLSV1_2
      mode: SIMPLE
  - hosts:
    - Foo.bar.com
    port:
      name: http
      number: 80
      protocol: HTTP
    tls:
      httpsRedirect: true
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: vs-foo
  namespace: foo
spec:
  gateways:
  - gateway-foo
  hosts:
  - Foo.bar.com
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: foo
        port:
          number: 8080

@annam8918
Copy link

@ramaraochavali ,
I have installed istio 1.23.0 version and tried the
Create a namescape called mytest and label the istio-injection:

kubectl create ns mytest
kubectl label namespace mytest istio-injection=enabled --overwrite

Apply the below yaml "testVirtualService.yaml"

testVirtualService.yaml:

apiVersion: v1
kind: Service
metadata:
name: httpbin
namespace: mytest
spec:
ports:
- port: 8000
name: http
selector:
app: httpbin

apiVersion: v1
kind: Pod
metadata:
labels:
app: sleep
version: v1
name: sleep
namespace: mytest
spec:
containers:

  • command:
    • /bin/sleep
    • 3650d
      image: curlimages/curl
      name: sleep
      resources:
      limits:
      cpu: 500m
      memory: 500Mi
      requests:
      cpu: 250m
      memory: 64Mi

apiVersion: v1
kind: Pod
metadata:
labels:
app: httpbin
version: v1
name: httpbin
namespace: mytest
spec:
containers:

  • image: docker.io/kennethreitz/httpbin
    name: httpbin
    resources:
    limits:
    cpu: 500m
    memory: 500Mi
    requests:
    cpu: 250m
    memory: 64Mi
    ports:
    - containerPort: 8000

apiVersion: v1
kind: Pod
metadata:
labels:
app: httpbin
version: v2
name: httpbin2
namespace: mytest
spec:
containers:

  • image: docker.io/kennethreitz/httpbin
    name: httpbin2
    resources:
    limits:
    cpu: 500m
    memory: 500Mi
    requests:
    cpu: 250m
    memory: 64Mi
    ports:
    - containerPort: 8000

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: httpbin
namespace: mytest
spec:
hosts:
- httpbin.mytest.svc.cluster.local
http:
- route:
- destination:
host: httpbin.mytest.svc.cluster.local
subset: v1
weight: 20
- destination:
host: httpbin.mytest.svc.cluster.local
subset: v2
weight: 80

If you check the routes for the sleep pod you can see that the VS is assign to the service/port:

❯ istioctl pc route sleep -n mytest --name=8000
NAME DOMAINS MATCH VIRTUAL SERVICE
8000 * /*
8000 httpbin, httpbin.mytest + 1 more... /* httpbin.mytest

Then edit the VS and modify the host field with the uppercate letters "HTTpbin.mytest.svc.cluster.local"

After modifying it if you check the routes again the VS is not assigned:

istioctl pc route sleep -n mytest --name=8000
NAME DOMAINS MATCH VIRTUAL SERVICE
8000 * /*
8000 httpbin, httpbin.mytest + 1 more... /*

seems the fix is provided in 1.23 version but looks like it is not working.
Is my repro is correct or is there anything I am missing ??

bug-report.txt

@ramaraochavali
Copy link
Contributor

That specific fix was for Gateways.

@annam8918
Copy link

That specific fix was for Gateways.

Is there any plan of fixing this for Virtual Services too? Or is it fixed prior to this?

@ramaraochavali
Copy link
Contributor

i think we should fix it. I remember fixing it but not able to find the PR

@annam8918
Copy link

can I create new issue??

@ramaraochavali
Copy link
Contributor

Yes. Please do that

@annam8918
Copy link

created new issue #55767

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants