Cleanup of duplicate or broken feeds #307

Open
5 tasks done
Marzal opened this issue Oct 30, 2023 · 38 comments

Comments

@Marzal
Contributor

Marzal commented Oct 30, 2023

@stevencrader
Collaborator

Thanks. I resolved all the issues.

@Marzal
Contributor Author

Marzal commented Oct 31, 2023

Thanks! Should I close this issue, or would you prefer that I use it for the next reports?

As:

@stevencrader
Collaborator

I removed those duplicates also. I'm fine if you keep this open.

@stevencrader
Collaborator

I removed the dupes as requested.

I do have a question for @daveajones about how iVoox feeds are handled. Do you have any best practices for these feeds? iVoox seems to provide different URLs in different places. They also seem to ignore part of the URL. For example, the following feeds all return 200 and the same content.

@daveajones
Contributor

Thanks guys.

@stevencrader Let me look through those and respond.

@cisene

cisene commented Nov 6, 2023

I've rewritten URLs to the customized variant at the bottom, like @stevencrader's example (https://www.ivoox.com/_fg_f1922056.xml), as that is how it was "discovered". iVoox also uses different subdomains (www, mx, or the country code of other Spanish-speaking countries); I've tried to normalize these before submitting to the Index.

Pattern: https://www.ivoox.com/_fg_f{identity}_filtro_1.xml

Also: once the link has been verified and followed, the final link is submitted to PI.
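
The normalization described above could be sketched roughly as follows. This is only an illustration, not the code actually used for the Index: the function name `normalize_ivoox_feed` is hypothetical, and it assumes the feed identity is the run of digits after `_fg_f`, that everything before `_fg_f` in the path can be dropped, and that the `_filtro_1` form from the pattern is the desired target. It does not verify the URL or follow redirects.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Assumption: the feed identity is the run of digits after "_fg_f",
# optionally followed by "_filtro_<n>", at the end of the path.
_IVOOX_FEED_ID = re.compile(r"_fg_f(\d+)(?:_filtro_\d+)?\.xml$")


def normalize_ivoox_feed(url: str) -> Optional[str]:
    """Return the pattern-form URL, or None if this is not an iVoox feed URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Accept ivoox.com and any of its subdomains (www, mx, ...).
    if host != "ivoox.com" and not host.endswith(".ivoox.com"):
        return None
    match = _IVOOX_FEED_ID.search(parts.path)
    if match is None:
        return None
    # Rebuild the URL on the www host using the pattern given above.
    return f"https://www.ivoox.com/_fg_f{match.group(1)}_filtro_1.xml"
```

With this sketch, a `mx.` URL with a show slug in front of `_fg_f` and the short `www` form both map to the same string, which makes duplicates easy to spot before submitting.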

@ThomasUmstattd

I found a cause of a lot of duplicate feeds in the index. PowerPress has a feature called "Enhance All Feeds" that turns every website RSS feed into a podcast feed. So example.com/feed is a podcast feed, as is example.com/feed/podcast. Unless the site is also a blog, these will have the same episodes.

Example: https://podcastindex.org/podcast/2123998 and https://podcastindex.org/podcast/4430.

PowerPress runs on 40k sites, and since this is the recommended setting, this causes tens of thousands of duplicate entries.

In 99% of cases, "example.com/feed/podcast" is the primary/original feed.
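
A rough way to surface candidates for this kind of duplicate is sketched below. The function name `powerpress_duplicate_candidates` and the input shape (a plain list of feed URLs) are assumptions for illustration, not part of the Index tooling. It only pairs URLs by path, so each pair still needs an episode comparison, since a site that is also a blog will have different content at the two URLs.

```python
from urllib.parse import urlsplit


def powerpress_duplicate_candidates(feed_urls):
    """Return (primary, duplicate) pairs where a site has both
    <path>/feed/podcast and <path>/feed in the given list."""
    by_key = {}
    for url in feed_urls:
        parts = urlsplit(url)
        # Key on lowercase host plus path without a trailing slash.
        key = ((parts.hostname or "").lower(), parts.path.rstrip("/"))
        by_key.setdefault(key, url)

    pairs = []
    for (host, path), url in by_key.items():
        if path.endswith("/feed"):
            # Treat .../feed/podcast as the primary feed, per the comment above.
            primary = by_key.get((host, path + "/podcast"))
            if primary is not None:
                pairs.append((primary, url))
    return pairs
```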

@daveajones
Contributor

Will look into this.

@stevencrader
Collaborator

stevencrader commented Jan 21, 2024

I can't replace a Feed URL directly. Instead, I added the desired Feed and marked the others as duplicates. See https://podcastindex.org/podcast/6761187

Same as above. See https://podcastindex.org/podcast/6761188

Done

@daveajones
Contributor

"Instead, I added the desired Feed and marked the others as duplicates."

That’s what I do most of the time too.

@Marzal
Contributor Author

Marzal commented Feb 1, 2024

Hi!

Question:

If the usual fix is to delete both entries/items in the PI DB and create a new one with the better feed, would the AP actor be lost in case someone is following @[email protected] on the Fediverse?

@stevencrader
Collaborator

That is an interesting question. I'm not sure how @daveajones is handling duplicates and the AP bridge.

@stevencrader
Collaborator

stevencrader commented Feb 26, 2024

I merged 6523766 into 6518287.

The P20 Feed was also added as https://podcastindex.org/podcast/6809087

I've messaged @daveajones about the AP issue, but I'm not sure how it will be treated. See https://podcastindex.social/@steven/111910978193046513

@Marzal
Contributor Author

Marzal commented Apr 25, 2024

Hi, new batch of dupes:

Not sure if the better approach is to add "source" (from official website) to PI or leave PI 252695 as the good one.

@stevencrader
Collaborator

Thanks. I cleaned up the duplicates. I tried to fix the feed URL of 252695 but am unable to, because there is a dead ID (4212695) that uses that URL. I want to keep 252695 because it has the iTunes association. The iTunes DB also uses the feed URL I'm trying to set.

https://www.spreaker.com/show/3681678/episodes/feed

Do you have any ideas @daveajones ?

@Marzal
Contributor Author

Marzal commented Jun 19, 2024

Greetings, fellow podcast lovers. New duplicates:

Good one (with Apple ID): https://podcastindex.org/podcast/171865 - but does it need a refresh for the artwork?

PS: The new https://episodes.fm icon is a faster way to find the one with an Apple ID; before, I was using the Podnews directory.

@daveajones
Contributor

Fixed!

@Marzal
Contributor Author

Marzal commented Jun 21, 2024

@stevencrader
Collaborator

New finds:

Fixed

@Marzal
Contributor Author

Marzal commented Jun 27, 2024

The one with the iTunes ID: https://podcastindex.org/podcast/161452 (but it has no image and not all episodes are in PI; the feed is OK and the enclosures are the same as FeedPress) - FeedBurner (seems like a mirror of FeedPress). Is a refresh/reset needed?

So, 1453442, or 161452 if the Apple ID is preferred for the Overcast fallback?

@stevencrader
Collaborator

Thanks. I kept 161452

@Marzal
Contributor Author

Marzal commented Jun 29, 2024

@stevencrader
Collaborator

Done. Thanks!

@rlarzac

rlarzac commented Jul 7, 2024

@stevencrader
Collaborator

@Marzal
Contributor Author

Marzal commented Jul 24, 2024

Hi, happy summer, everyone. New batch:

9 decibelios:

iVoox OK: https://podcastindex.org/podcast/2112302 🆗

Eyes On Success:

The one with Apple ID: https://podcastindex.org/podcast/314749 🆗

Audio momentos:

The one with Apple ID: https://podcastindex.org/podcast/955379 🆗

Frecuencia Improvisada

The one with Apple ID: https://podcastindex.org/podcast/336764 🆗

@stevencrader
Collaborator

stevencrader commented Jul 27, 2024

Thanks

@daveajones Not sure when they were added, but should your auto dupe checker have caught the difference in the feed URLs for these 2 feed IDs?

@Marzal
Contributor Author

Marzal commented Aug 13, 2024

Hi, what's the policy on dead, inaccessible podcasts? This one is dead both on the web and in its enclosures:

@daveajones
Contributor

Do you mean, when do they get removed after 404?

@Marzal
Contributor Author

Marzal commented Aug 14, 2024

A few questions, really:
Are they auto-removed if:

  • Feed 404
  • Enclosures 404

And if in some cases they are not, would manual reporting (like I did) get them removed even if one is on Apple Podcasts, or is the API compatibility for Marco (not sure about others) only for podcasts that actually work?

I'm asking so I know what to report, and out of a bit of curiosity too.

PS: Have you considered exposing, on the PI website or in the API, the date since which the crawlers have detected that the feed returns 404, so people or apps are warned that the podcast might not work or might be removed?

@Marzal
Contributor Author

Marzal commented Sep 2, 2024

Hi, new batch to add to #307 (comment)

Adolescencia Positiva:

Sé feliz donde estés:

Universo Hijos:

Educación Respetuosa:

Inversapiens - Todos Somos Inversionistas:

@stevencrader
Collaborator

Sé feliz donde estés

Since the 2 with the iVoox feed are both in Apple, I left them.

@Marzal
Contributor Author

Marzal commented Sep 9, 2024

Mujeres con Historia:

@stevencrader
Collaborator

Done

@Marzal
Contributor Author

Marzal commented Sep 20, 2024

Producción Propia

@stevencrader
Collaborator

Done


6 participants