Data delivery issues #18031

Open
pavloburykh opened this issue Nov 30, 2023 · 13 comments
Labels
bug, message-reliability, mobile-core, waku

Comments

@pavloburykh
Contributor

In this issue we will continue posting data delivery bugs that are found in mobile develop.

@pavloburykh
Contributor Author

pavloburykh commented Nov 30, 2023

Currently E2E tests are failing in different PRs because of the following data delivery issues:

1. Contact request is not delivered to the receiver

Example is taken from this PR

User A sends CR
User B does not receive CR

User A_CR sender.log
User B_CR receiver.log

2. A community that does not require manual approval gets stuck in Pending after a join request is sent

Example taken from this PR

User A sends a request to join the community. The community status does not change to Joined automatically but stays stuck in Pending.

User A_community request sender.log
User B_community admin.log

@churik
Member

churik commented Dec 7, 2023

Confirming issue 2 on the new status-go version.
The Pending state persists for more than 60 seconds, after which the test fails.
Go version: 794d72f
Example from #18110

Logs:

@pavloburykh
Contributor Author

pavloburykh commented Dec 12, 2023

@vitvly hi! I have closed this issue and am posting the fresh logs and video from the nightly build (12/12/23) here.

User A and User B are members of the same community

Steps:

  1. User B closes the app
  2. User A goes offline and sends a stack of messages
  3. User A returns online and waits until all messages are sent
  4. User B opens the app and opens the channel
  5. See if User B received all messages sent by User A

Actual result: some of the messages are lost for User B.

In the video you can see that User B has not received "Q" from the last stack of messages.

The video also shows the message history, where only "Q, I, O, P" out of "Q, W, E, R, T, Y, U, I, O, P" from the previous stack of messages were delivered to User B.
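
For anyone triaging similar reports, a minimal sketch of how the lost messages in a stack could be listed is shown below. The helper name `missing_messages` is hypothetical and not part of any Status tooling, and it assumes each message text within a stack is unique; real logs would more likely be compared by message ID.

```python
def missing_messages(sent, received):
    """Return the sent messages that never arrived, in send order.

    Assumption: message texts are unique within one stack. With repeated
    texts, compare message IDs instead.
    """
    received_set = set(received)
    return [msg for msg in sent if msg not in received_set]


# Values taken from the previous stack described above.
sent = ["Q", "W", "E", "R", "T", "Y", "U", "I", "O", "P"]
received = ["Q", "I", "O", "P"]

print(missing_messages(sent, received))
# prints ['W', 'E', 'R', 'T', 'Y', 'U']
```

This only reports which texts are absent on the receiver; it says nothing about where in the pipeline they were dropped.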

User A sender.zip
User B receiver.zip

delivery_bugs.mp4

siddarthkay added a commit to status-im/status-go that referenced this issue Dec 20, 2023
When debugging message reliability we often get the number of messages sent and their IDs but we do not know the content of the messages and the type of message sent.

This commit adds debug level logs so that it helps in investigations.

ref : status-im/status-mobile#18031

Closes [#18206](status-im/status-mobile#18206)
@siddarthkay
Contributor

@pavloburykh: I was able to reproduce this exact issue. I sent messages in a fixed order, and Device B did not receive them in the same order.
screenshot :
Screenshot 2023-12-20 at 5 06 15 PM

As you can see, messages 2 and 3, which were sent by Device A, are missing on Device B.
However, as I kept the chat open, I did see 3 appear on Device B, and when I closed the chat and opened it again, all the messages were there and back in the same order.
screenshot :
Screenshot 2023-12-20 at 5 07 05 PM

cc @cammellos

siddarthkay added a commit to status-im/status-go that referenced this issue Dec 21, 2023
siddarthkay added a commit to status-im/status-go that referenced this issue Dec 22, 2023
siddarthkay added a commit to status-im/status-go that referenced this issue Dec 23, 2023
@pavloburykh
Contributor Author

pavloburykh commented Feb 5, 2024

UPDATE (Feb 5, 2024): we are currently facing global data delivery issues that have been caught by our e2e tests in different PRs. Below are a couple of examples. cc @cammellos @chaitanyaprem @vitvly

1. Community status has not been updated from Pending to Joined for community request sender

Example from PR #18602

Preconditions: User A is an admin of a non-token-gated community that does not require the admin's manual approval (hereinafter 'Community')

Steps:

  1. User A shares the Community link with User B and stays online
  2. User B sends Community join request
  3. See if User B joined the community

Actual result: Community status remains in Pending for User B.

Expected result: Community status should change to Joined for User B.

User_A_comm_admin.log
User_B_com_request_sender.log

2. Message sent in community is not visible to other community member

Example of the PR #18438

Preconditions: User A and User B are members of the same community.

Steps:

  1. User A sends a message in Community
  2. See if the message is visible for User B

Actual result: User B does not see the message in Community

User_A_comm_message_sender.log
User_B_comm_message_receiver.log

@chaitanyaprem

> 2. Message sent in community is not visible to other community member
>
> Example of the PR #18438
>
> Preconditions: User A and User B are members of the same community.
>
> Steps:
>
> 1. User A [sends](https://app.eu-central-1.saucelabs.com/tests/651b78966ba948d89efb655e8e6a641f?auth=23e2eec0f06bc9216d647f81154aedb7#167) message in Community
> 2. See if message is visible for User B
>
> Actual result: User B does not see a message in Community
>
> User_A_comm_message_sender.log User_B_comm_message_receiver.log

Looking at the logs, this seems to be the subscribe/unsubscribe ordering issue status-im/status-go#4659. A similar overlap is seen on the fleet nodes, which must have caused the message not to be delivered.

Screenshot 2024-02-05 at 7 06 10 PM
16807 DEBUG[02-05|09:44:33.184|github.com/status-im/status-go/vendor/github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:239] sending FilterSubscribeRequest peerID=16Uiu2HAm8mUZ18tBWPXDQsaF7PbCKYA35z7WB2xNZH2EVq1qS8LJ request="request_id:\"0c15df32830e1c65406c8023ddeaee44fc7bf9032ae724b32fe0dd618673daef\"  filter_subscribe_type:UNSUBSCRIBE  pubsub_topic:\"/waku/2/rs/16/32\"  content_topics:\"/waku/1/0x667d7b3f/rfc26\""
16808 DEBUG[02-05|09:44:33.184|github.com/status-im/status-go/vendor/github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:239] sending FilterSubscribeRequest peerID=16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX request="request_id:\"50d4c03ea9182991fd453e0f50764d7e4d8437d3a1bb49a1ca316b5729105797\"  filter_subscribe_type:SUBSCRIBE  pubsub_topic:\"/waku/2/rs/16/32\"  content_topics:\"/waku/1/0x667d7b3f/rfc26\""
16809 DEBUG[02-05|09:44:33.184|github.com/status-im/status-go/vendor/go.uber.org/zap/sugar.go:163] [16Uiu2HAm8fSFpYNEZEoKNW1xRT9cmB5Pfq9QWFLwVzxbXVe1cBVh] opening stream to peer [16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX]
16810 DEBUG[02-05|09:44:33.184|github.com/status-im/status-go/wakuv2/filter_manager.go:323] filter sub is closed id=2e4251fe-c1f8-4b85-acef-21152810f9c9
16811 DEBUG[02-05|09:44:33.185|github.com/status-im/status-go/vendor/github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:239] sending FilterSubscribeRequest peerID=16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX request="request_id:\"c63f3f97dfeb2ddf459ef6f9cda07ab6407af92f625cea460caed7cb9ff9f361\"  filter_subscribe_type:SUBSCRIBE  pubsub_topic:\"/waku/2/rs/16/32\"  content_topics:\"/waku/1/0x667d7b3f/rfc26\""
16812 DEBUG[02-05|09:44:33.183|github.com/status-im/status-go/wakuv2/filter_manager.go:216] filter unsubscribe from filter node filterId=14109b1baf748faed16c1501e296ab6b04dd874154d312d15be935671641c2a1 subId=4d9f79f1-4cf5-4b7a-bcc2-3e18a0618333 peer=16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX
16813 DEBUG[02-05|09:44:33.185|github.com/status-im/status-go/vendor/go.uber.org/zap/sugar.go:163] [16Uiu2HAm8fSFpYNEZEoKNW1xRT9cmB5Pfq9QWFLwVzxbXVe1cBVh] opening stream to peer [16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX]
16814 DEBUG[02-05|09:44:33.185|github.com/status-im/status-go/vendor/github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:239] sending FilterSubscribeRequest peerID=16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX request="request_id:\"e552f7f01ef9226fc048e8f6179219c05fd4556584e7497d8096f117c122c4c4\"  filter_subscribe_type:UNSUBSCRIBE  pubsub_topic:\"/waku/2/rs/16/32\"  content_topics:\"/waku/1/0x667d7b3f/rfc26\""
16815 DEBUG[02-05|09:44:33.185|github.com/status-im/status-go/wakuv2/filter_manager.go:323] filter sub is closed id=4d9f79f1-4cf5-4b7a-bcc2-3e18a0618333
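
To spot this kind of overlap in other logs without reading them line by line, a rough Python sketch follows. It is a heuristic only: the function name `find_racing_unsubscribes`, the regular expression, and the 5 ms window are assumptions made for illustration, not part of status-go or go-waku. It flags an UNSUBSCRIBE sent to a peer for a content topic shortly after a SUBSCRIBE to the same peer and topic. It cannot tell which subscription the UNSUBSCRIBE was meant for, so every hit still needs manual review.

```python
import re

# Matches the FilterSubscribeRequest lines shown above. The quotes inside
# request="..." appear in the log as a literal backslash followed by a quote.
LINE_RE = re.compile(
    r'\|(?P<ts>\d\d:\d\d:\d\d\.\d{3})\|.*?'
    r'peerID=(?P<peer>\S+).*?'
    r'filter_subscribe_type:(?P<kind>[A-Z_]+).*?'
    r'content_topics:\\"(?P<topic>[^"\\]+)\\"'
)


def to_ms(ts):
    """Convert 'HH:MM:SS.mmm' to milliseconds since midnight (date is ignored)."""
    hours, minutes, rest = ts.split(":")
    seconds, millis = rest.split(".")
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def find_racing_unsubscribes(lines, window_ms=5):
    """Return (peer, topic, timestamp) for each UNSUBSCRIBE that is logged
    within window_ms of an earlier SUBSCRIBE to the same peer and topic."""
    last_subscribe = {}  # (peer, topic) -> time of the latest SUBSCRIBE, in ms
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        key = (match["peer"], match["topic"])
        now = to_ms(match["ts"])
        if match["kind"] == "SUBSCRIBE":
            last_subscribe[key] = now
        elif match["kind"] == "UNSUBSCRIBE" and key in last_subscribe:
            if abs(now - last_subscribe[key]) <= window_ms:
                hits.append((match["peer"], match["topic"], match["ts"]))
    return hits


# Three of the lines above (16807, 16808, 16814), with wrap whitespace removed.
SAMPLE_LINES = [
    r'16807 DEBUG[02-05|09:44:33.184|github.com/status-im/status-go/vendor/github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:239] sending FilterSubscribeRequest peerID=16Uiu2HAm8mUZ18tBWPXDQsaF7PbCKYA35z7WB2xNZH2EVq1qS8LJ request="request_id:\"0c15df32830e1c65406c8023ddeaee44fc7bf9032ae724b32fe0dd618673daef\"  filter_subscribe_type:UNSUBSCRIBE  pubsub_topic:\"/waku/2/rs/16/32\"  content_topics:\"/waku/1/0x667d7b3f/rfc26\""',
    r'16808 DEBUG[02-05|09:44:33.184|github.com/status-im/status-go/vendor/github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:239] sending FilterSubscribeRequest peerID=16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX request="request_id:\"50d4c03ea9182991fd453e0f50764d7e4d8437d3a1bb49a1ca316b5729105797\"  filter_subscribe_type:SUBSCRIBE  pubsub_topic:\"/waku/2/rs/16/32\"  content_topics:\"/waku/1/0x667d7b3f/rfc26\""',
    r'16814 DEBUG[02-05|09:44:33.185|github.com/status-im/status-go/vendor/github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:239] sending FilterSubscribeRequest peerID=16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX request="request_id:\"e552f7f01ef9226fc048e8f6179219c05fd4556584e7497d8096f117c122c4c4\"  filter_subscribe_type:UNSUBSCRIBE  pubsub_topic:\"/waku/2/rs/16/32\"  content_topics:\"/waku/1/0x667d7b3f/rfc26\""',
]

for peer, topic, ts in find_racing_unsubscribes(SAMPLE_LINES):
    print(f"{ts}: UNSUBSCRIBE follows SUBSCRIBE for {topic} on {peer}")
```

On the three sample lines this reports the UNSUBSCRIBE at 09:44:33.185 to peer 16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX. The UNSUBSCRIBE to the other peer is not flagged because no SUBSCRIBE to that peer precedes it in the sample.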

@vitvly
Contributor

vitvly commented Feb 5, 2024

The PR that hopes to address the out-of-order problem identified by @chaitanyaprem : status-im/status-go#4665

@pavloburykh
Contributor Author

> The PR that hopes to address the out-of-order problem identified by @chaitanyaprem : status-im/status-go#4665

Thanks @vitvly! Would you mind creating a corresponding mobile PR once the Go PR has been reviewed? Then we could run e2e and check on the mobile side.

@qoqobolo
Contributor

qoqobolo commented Feb 5, 2024

> Looking at the logs, this seems to be the subscribe/unsubscribe ordering issue status-im/status-go#4659. A similar overlap is seen on the fleet nodes, which must have caused the message not to be delivered.

Adding another example of the message delivery issue from today, in case this is something different from what Pavlo reported:

My.Movie.0502.mp4

(from 0:15 to 2:04 I'm just waiting for delivery, so you can fast-forward that part)

As you can see, some messages are delivered with a delay, some are not delivered at all, and for some delivered messages the delivery status does not change for the sender.

Logs
Sender 05_02.zip
Receiver 05_02.zip

@chaitanyaprem @vitvly @cammellos

@vitvly
Contributor

vitvly commented Feb 5, 2024

> > The PR that hopes to address the out-of-order problem identified by @chaitanyaprem : status-im/status-go#4665
>
> Thanks @vitvly! Would you mind creating a corresponding mobile PR once go PR will be reviewed? So we could run e2e and check on mobile side?

Yes, sure. I remember this has to be done; will do.

@VolodLytvynenko
Contributor

VolodLytvynenko commented Feb 20, 2024

A community that requires no manual approval gets stuck in Pending after a join request is sent when the light client is disabled

Hey. Currently E2E is failing when the light client is disabled. I was able to reproduce this issue manually, but only when both nodes return from being offline, as described in the steps below.

Steps:

  1. User A creates a community that requires no manual approval and shares it with User B
  2. User A goes offline
  3. User B requests to join
  4. User B goes offline
  5. Both users come back online

Actual result:

The community is in 'pending' status for User B

My.Movie.1.mp4

Expected result:

The community is in 'joined' status

Additional info:

The issue is also reproducible when User A is a desktop user

Logs:

User A logs:
logs.zip

User B logs:
Status-debug-logs.zip

@chaitanyaprem

chaitanyaprem commented Feb 21, 2024

> A community that requires no manual approval gets stuck in Pending after a join request is sent when the light client is disabled
>
> Hey. Currently E2E is failing when the light client is disabled. I was able to reproduce this issue manually, but only when both nodes return from being offline, as described in the steps below.
>
> Steps:
>
> 1. `User A` creates a community that requires no manual approval and shares it with `User B`
> 2. `User A` goes offline
> 3. `User B` requests to join
> 4. `User B` goes offline
> 5. Both users come back online
>
> Actual result:
>
> The community is in 'pending' status for User B
> My.Movie.1.mp4
>
> Expected result:
>
> The community is in 'joined' status
>
> Additional info:
>
> The issue is also reproducible when User A is a desktop user
>
> Logs:
>
> User A logs: logs.zip
>
> User B logs: Status-debug-logs.zip

I can't figure out which messages to look at in the logs. Maybe @vitvly or @cammellos can help identify them.

Typically, when a device goes offline and comes back online, the store query should have fetched the messages that were missed.
One thing I noticed in User A's logs, though, is that there were cases where message publishing failed even after retries because the device was offline. I am not sure how retries are handled for such internal control messages when the network is unavailable.
For user messages a retry link is shown, which lets the user resend once connectivity is back. But for messages like Join Community or Accept Contact Request, I am not sure how this is handled. If these are not retried when the device comes back online, that could be a cause of the lost state.
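
To make the question concrete, here is a small Python sketch of one way control messages could be parked while offline and re-published when connectivity returns. It is purely illustrative and is not how status-go works today: `ControlMessageOutbox`, `PendingMessage`, the `publish` callback, and the attempt limit are all hypothetical names and choices.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PendingMessage:
    kind: str          # hypothetical label, e.g. "community-join-request"
    payload: bytes
    attempts: int = 0


class ControlMessageOutbox:
    """Keeps control messages that failed to publish and retries them later."""

    def __init__(self, publish: Callable[[PendingMessage], bool], max_attempts: int = 5):
        self._publish = publish          # assumed to return True on success
        self._max_attempts = max_attempts
        self._pending: List[PendingMessage] = []
        self.dropped: List[PendingMessage] = []

    @property
    def pending(self) -> List[PendingMessage]:
        return list(self._pending)

    def send(self, msg: PendingMessage) -> None:
        """Try to publish now; park the message if publishing fails."""
        if not self._try(msg):
            self._park(msg, self._pending)

    def on_connectivity_restored(self) -> None:
        """Retry every parked message once the device is back online."""
        still_pending: List[PendingMessage] = []
        for msg in self._pending:
            if not self._try(msg):
                self._park(msg, still_pending)
        self._pending = still_pending

    def _try(self, msg: PendingMessage) -> bool:
        msg.attempts += 1
        try:
            return bool(self._publish(msg))
        except OSError:
            # Treat network errors as a failed attempt rather than crashing.
            return False

    def _park(self, msg: PendingMessage, queue: List[PendingMessage]) -> None:
        # Give up after max_attempts so a bad message cannot be retried forever.
        if msg.attempts >= self._max_attempts:
            self.dropped.append(msg)
        else:
            queue.append(msg)


# Usage with a fake network that starts offline.
network = {"online": False}
delivered = []


def fake_publish(msg: PendingMessage) -> bool:
    if not network["online"]:
        return False
    delivered.append(msg.kind)
    return True


outbox = ControlMessageOutbox(fake_publish)
outbox.send(PendingMessage("community-join-request", b"..."))
network["online"] = True
outbox.on_connectivity_restored()
print(delivered)
```

The point of the sketch is the `on_connectivity_restored` hook: without something equivalent, a join request or contact-request acceptance that fails while offline would never be sent again.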

@yevh-berdnyk
Contributor

One more example from e2e tests on the Feb 21 nightly build where light client is disabled.

Preconditions:

There is a group chat with 3 users.

Steps:

  • User A goes offline (airplane mode)
  • User B sends a message in the group chat with the text "message from old member"
  • User C sends a message in the group chat with the text "message from new member"
  • User A goes back online and checks the messages in the group chat
  • User A sends 2 more messages in the group chat: "Message 1" and "message in the muted chat"
  • Users B and C don't receive these messages

Logs:

user_A_logcat.log.zip
user_B_logcat.log.zip
user_C_logcat.log.zip

@cammellos removed this from the 2.27.0 Alpha milestone Mar 14, 2024
@churik added the waku and mobile-core labels Jun 21, 2024

9 participants