Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation for info-hash change behavior #260

Merged
merged 7 commits into from
Aug 25, 2023
Merged

Documentation for info-hash change behavior #260

merged 7 commits into from
Aug 25, 2023

Conversation

josecelano
Copy link
Member

@josecelano josecelano commented Aug 24, 2023

If you upload a torrent with custom fields in the info key dictionary, the torrent infohash changes because we do not parse/store custom fields.

  • Add test
  • Add ADR
  • Store original info-hash in the database (torrust_torrents table)
  • Show a warning to the user after uploading the torrent if the info-hash changes. (frontend task).

@da2ce7
Copy link
Contributor

da2ce7 commented Aug 24, 2023

Do you think that we should detect that we are removing data from the torrent info-dictionary, and provide a warning when uploading the torrent?

Just silently dropping these fields doesn't seem so nice...

@josecelano josecelano linked an issue Aug 24, 2023 that may be closed by this pull request
@josecelano
Copy link
Member Author

Do you think that we should detect that we are removing data from the torrent info-dictionary, and provide a warning when uploading the torrent?

Just silently dropping these fields doesn't seem so nice...

Yes, I think that would be the best solution. For the time being, I was planning just to "statically" inform the user about this behavior. I thought that was what you proposed.

If we want to "dynamically" warn the user when this happens, we have at least two options:

Option 1

  • Parse the uploaded torrent and calculate the right infohash with bip_bencode. I do not know if you want to use the public crate or the forked version we have on the tracker. If we use our version, we have to publish it.
  • Change the upload torrent endpoint response, to include an extra message.

Option 2

This option requires a more complex UX. When you upload a torrent and the infohash changes, we do not save it. We only save a "draft indexed torrent". Then, we show the user a confirmation message that actually stores the torrent.

I think the UI is not ready for this kind of interaction. It probably requires introducing more UI elements.

Option 3

We add a static message in the upload form: "The info-hash, might change if you use custom fields.", linking to the User's Guide.

@da2ce7
Copy link
Contributor

da2ce7 commented Aug 24, 2023

I think that the best option is to store the original info-hash alongside the original filename in the database.

The uploading torrent endpoint should provide a warning "the infohash has changed from X to Y, as we have removed non-standard keys from the info-dictionary." This should be displayed to the user after the successful upload.

@josecelano
Copy link
Member Author

I think that the best option is to store the original info-hash alongside the original filename in the database.

The uploading torrent endpoint should provide a warning "the infohash has changed from X to Y, as we have removed non-standard keys from the info-dictionary." This should be displayed to the user after the successful upload.

OK @da2ce7. I will use the public bip_encode to get the original infohash.

@codecov-commenter
Copy link

codecov-commenter commented Aug 24, 2023

Codecov Report

Merging #260 (a59db9b) into develop (2cf80bc) will increase coverage by 0.38%.
The diff coverage is 50.00%.

@@             Coverage Diff             @@
##           develop     #260      +/-   ##
===========================================
+ Coverage    44.42%   44.80%   +0.38%     
===========================================
  Files           77       77              
  Lines         4115     4111       -4     
===========================================
+ Hits          1828     1842      +14     
+ Misses        2287     2269      -18     
Files Changed Coverage Δ
src/databases/database.rs 30.55% <ø> (+0.82%) ⬆️
src/databases/mysql.rs 0.00% <0.00%> (ø)
src/databases/sqlite.rs 23.61% <0.00%> (ø)
src/web/api/v1/contexts/torrent/handlers.rs 50.62% <0.00%> (+0.31%) ⬆️
src/web/api/v1/contexts/torrent/responses.rs 0.00% <0.00%> (ø)
src/services/torrent.rs 20.54% <15.38%> (-0.23%) ⬇️
src/utils/parse_torrent.rs 64.86% <95.23%> (+39.86%) ⬆️

... and 21 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@josecelano
Copy link
Member Author

josecelano commented Aug 24, 2023

Hi @da2ce7, I think I'm going to let the frontend handle the case when the infohash changes. The response after uploading a torrent is:

/// Response after successfully uploading a new torrent.
pub fn new_torrent_response(torrent_id: TorrentId, info_hash: &str) -> Json<OkResponseData<NewTorrentResponseData>> {
    Json(OkResponseData {
        data: NewTorrentResponseData {
            torrent_id,
            info_hash: info_hash.to_owned(),
        },
    })
}

I think we can simply add a new field with the original info hash value:

/// Response after successfully uploading a new torrent.
pub fn new_torrent_response(torrent_id: TorrentId, info_hash: &str, original_info_hash: &str) -> Json<OkResponseData<NewTorrentResponseData>> {
    Json(OkResponseData {
        data: NewTorrentResponseData {
            torrent_id,
            info_hash: info_hash.to_owned(),
            original_info_hash: original_info_hash.to_owned(),
        },
    })
}

I think that's better than adding an extra message:

/// Response after successfully uploading a new torrent.
pub fn new_torrent_response(torrent_id: TorrentId, info_hash: &str, message: &str) -> Json<OkResponseData<NewTorrentResponseData>> {
    Json(OkResponseData {
        data: NewTorrentResponseData {
            torrent_id,
            info_hash: info_hash.to_owned(),
            message: message.to_owned(), // "InfoHash has change"
        },
    })
}

The frontend can check the response and show the right message according to the received data.

UPDATE

If you think it's OK, I only have to add the field to the response, and the PR will be ready to review.

@josecelano
Copy link
Member Author

Hi @da2ce7, I've been thinking about changing the infohash again, and I do not like the side effect. It has a similar effect to the "source" file. As we change the infohash:

  • All the trackers listed on the torrent file are not going to work because the infohash has changed.
  • Peer can only download the torrent from other peers using the same tracker.

What we are doing is kind of creating a new "normalized" torrent but somehow splitting the swarm.

In a public index/tracker, I think we should generate the same torrent, only adding the linked/associated tracker in the announce list.

@da2ce7 da2ce7 added the Needs Rebase Base Branch has Incompatibilities label Aug 25, 2023
The announce_list is an array of arrays in the intermodal JSON output:

```
"announce-list": [
      [
         "https://academictorrents.com/announce.php"
      ],
      [
         "https://ipv6.academictorrents.com/announce.php"
      ],
      [
         "udp://tracker.opentrackr.org:1337/announce"
      ],
      [
         "udp://tracker.openbittorrent.com:80/announce"
      ],
      [
         "http://bt1.archive.org:6969/announce"
      ],
      [
         "http://bt2.archive.org:6969/announce"
      ]
   ],
```

Not an array of strings.
…ields

in the `info` key dictionary. For example:

```json
{
   "announce": "https://academictorrents.com/announce.php",
   "announce-list": [
      [
         "https://academictorrents.com/announce.php"
      ],
      [
         "https://ipv6.academictorrents.com/announce.php"
      ],
      [
         "udp://tracker.opentrackr.org:1337/announce"
      ],
      [
         "udp://tracker.openbittorrent.com:80/announce"
      ],
      [
         "http://bt1.archive.org:6969/announce"
      ],
      [
         "http://bt2.archive.org:6969/announce"
      ]
   ],
   "comment": "This content hosted at the Internet Archive at https://archive.org/details/rapppid-weights.tar\nFiles may have changed, which prevents torrents from downloading correctly or completely; please check for an updated torrent at https://archive.org/download/rapppid-weights.tar/rapppid-weights.tar_archive.torrent\nNote: retrieval usually requires a client that supports webseeding (GetRight style).\nNote: many Internet Archive torrents contain a 'pad file' directory. This directory and the files within it may be erased once retrieval completes.\nNote: the file rapppid-weights.tar_meta.xml contains metadata about this torrent's contents.",
   "created by": "ia_make_torrent",
   "creation date": 1689273787,
   "info": {
      "collections": [
         "org.archive.rapppid-weights.tar"
      ],
      "files": [
         {
            "crc32": "57d33fcc",
            "length": 11528324,
            "md5": "e91bb4ba82695161be68f8b33ae76142",
            "mtime": "1689273730",
            "path": [
               "RAPPPID Weights.tar.gz"
            ],
            "sha1": "45970ef33cb3049a7a8629e40c8f5e5268d1dc53"
         },
         {
            "crc32": "c658fd4f",
            "length": 20480,
            "md5": "a782b2a53ba49f0d45f3dd6e35e0d593",
            "mtime": "1689273783",
            "path": [
               "rapppid-weights.tar_meta.sqlite"
            ],
            "sha1": "bcb06b3164f1d2aba22ef6046eb80f65264e9fba"
         },
         {
            "crc32": "8140a5c7",
            "length": 1044,
            "md5": "1bab21e50e06ab42d3a77d872bf252e5",
            "mtime": "1689273763",
            "path": [
               "rapppid-weights.tar_meta.xml"
            ],
            "sha1": "b2f0f2bbec34aa9140fb9ac3fcb190588a496aa3"
         }
      ],
      "name": "rapppid-weights.tar",
      "piece length": 524288,
      "pieces": "<hex>AB EC 55 6E 0F 7B E7 D3 30 0C F6 68 8C 90 6D 99 0C 3E 32 B5 2C F2 B6 7C 0C 32 52 BC 72 6F 07 1E 73 AB 76 F1 BC 32 2B FC 21 D4 7F 1A E9 72 35 40 7E C3 B4 89 09 2B ED 4B D8 B0 6C 65 8C 27 58 AE FB 72 75 73 44 37 88 28 20 D2 94 BD A4 6A F8 D2 A6 FD 02 65 1C 1C DF B8 56 6D 3A D2 7E A7 3D CA E2 49 F7 36 8D 17 77 6E 32 AD EF A5 44 C2 8F B6 9C 24 56 AD E8 FB 7B A6 71 C0 81 E5 43 03 91 D4 4F B0 A6 64 CA 29 1B 0D 1D 40 7D 39 4E 76 96 EB 01 18 F3 F5 50 8E 2F FA 54 FC 49 66 85 D8 38 87 78 9B 0A 8F 7A A3 2C 8F 72 36 AD 6D 74 0B FC C5 57 71 86 FB F3 CF CA C9 DA EC 61 62 A2 2A 1B A7 85 89 91 8F AA C0 C0 CB 0D 57 D8 B7 E7 64 4D F2 84 73 76 98 FB 3A 17 48 D7 9C 01 FE CA 6D 1F C5 97 34 05 54 39 DA C2 6E 17 41 11 69 F3 46 D1 7D 16 D3 C0 87 3B C3 B2 0C 1D E0 E2 49 C3 05 D2 4C 00 5A 5B 78 01 12 3E BF 52 43 49 6D 1A EE 23 79 D2 0E 28 B6 84 7E C5 ED 79 DE 64 02 ED 47 71 3D 93 16 C4 A2 76 18 77 54 C5 31 48 96 3A 51 C1 4A 92 90 91 F3 CF 48 5B 24 86 55 A8 EB 0C C6 2D 86 E2 29 56 09 2C 38 0B CD C1 CA 45 E6 64 6A 47 FE BB 2E 47 9A 77 45 29 E9 72 19 20 6F 42 79 2B 37 B9 53 25 ED 0F 29 04 D5 E2 96 F1 DE CF 99 BE 32 AA B8 0A 1D 0B 9F B9 D6 AB 5C 50 43 78 85 41 09 01 24 CF E0 89 76 5B 4D A9 CA 72 C0 DF 92 47 0F 0D CE CA 96 C6 7E A5 41 5F 2B A7 BB 04 CC F7 44 7F 94 1E 24 D2 1B 17 CA 18 79 90 A3 D6 20 75 A2 96 68 06 58 5A DE F5 2C 1A 90 22 72 33 8E D5 B2 A8 FA E5 E9 E7 69 62 02 7C 09 B3 4C</hex>"
   },
   "locale": "en",
   "title": "rapppid-weights.tar",
   "url-list": [
      "https://archive.org/download/",
      "http://ia902702.us.archive.org/22/items/",
      "http://ia802702.us.archive.org/22/items/"
   ]
}
```

Notice the `collections` array inside `info` key.
We can change the final torrent infohash becuase we do not extract/parse
non standard fields in the info dictionary. We want to keep a copy of
the original uploaded torrent's infohash. Si we need to calcuylate the
infohash with all the fields in the info dictionary.

We will sotre it in the database and warn users in the UI  when the infohash
changes.
Torrent InfoHash migth change after uploading it becuase we do not
extract/store all the fields in the info dictionary. We store a copy of
the original infohash in the database field
`torrust_torrents::original_info_hash`.
```json
{
    "data": {
        "torrent_id": 174,
        "info_hash": "8aa01a4c816332045ffec83247ccbc654547fedf",
        "original_info_hash": "6c690018c5786dbbb00161f62b0712d69296df97"
    }
}
```

`original_info_hash` contains the infohash of the original uploaded
torrent file. It migth change if the torrent contained custom fields in
the info dictionary.
@josecelano josecelano removed the Needs Rebase Base Branch has Incompatibilities label Aug 25, 2023
Copy link
Contributor

@da2ce7 da2ce7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 9bb8578

@josecelano
Copy link
Member Author

Hi @da2ce7, I've been thinking about changing the infohash again, and I do not like the side effect. It has a similar effect to the "source" file. As we change the infohash:

  • All the trackers listed on the torrent file are not going to work because the infohash has changed.
  • Peer can only download the torrent from other peers using the same tracker.

What we are doing is kind of creating a new "normalized" torrent but somehow splitting the swarm.

In a public index/tracker, I think we should generate the same torrent, only adding the linked/associated tracker in the announce list.

Hi @da2ce7 I will merge this PR, but what do you think about this ☝🏼?

@josecelano josecelano merged commit 79ff3cc into torrust:develop Aug 25, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Error calculating infohash with custom fields in info dict
3 participants