set-alias-page.py generates misleading pages #12747

Open
acuteenvy opened this issue May 6, 2024 · 7 comments
Labels: bug (Issues with our clients or rendering of pages, etc.), tooling (Helper tools, scripts and automated processes.)

Comments

@acuteenvy
Member

acuteenvy commented May 6, 2024

The get_alias_page function doesn't identify alias pages reliably, so the script also processes non-alias pages and makes a huge mess by generating misleading pages. It also mishandles page titles that contain spaces.

Examples: #11365.
Previous runs of this script:

set-alias-page.py should not be used until this is fixed.

@acuteenvy added the bug and tooling labels May 6, 2024
@sebastiaanspeck
Member

We could add a warning in the script to make people aware of this issue. Mentioning only here that the script should not be used is not visible enough for others.

@sebastiaanspeck
Member

> We could add a warning in the script to make people aware of this issue. Mentioning only here that the script should not be used is not visible enough for others.

I see that it already has a warning at the top of the script:

Disclaimer: This script generates a lot of false positives so it
isn't suggested to use the sync option. If used, only stage changes
and commit verified changes for your language.

@blueskyson
Member

blueskyson commented Jul 22, 2024

I just wrote a script with the help of ChatGPT to filter out potentially affected pages:

import os
import re

def filter_files(directory):
    # List to store names of files that meet the criteria
    matching_files = []

    # Iterate through all files in the given directory
    for filename in os.listdir(directory):
        file_path = os.path.join(directory, filename)
        
        # Check if it's a file (and not a directory)
        if os.path.isfile(file_path):
            with open(file_path, 'r', encoding='utf-8') as file:
                lines = file.readlines()
                
                contains_tldr = False
                contains_alias = False
                
                for line in lines:
                    if re.search(r"^`tldr ", line):
                        contains_tldr = True
                    if "This command is an alias of" in line:
                        contains_alias = True
                        
                if contains_tldr and not contains_alias:
                    matching_files.append(filename)
    
    # Print the names of the matching files
    if len(matching_files) > 0:
        print(directory)
        for file in matching_files:
            print(file)
        print()

# Set repo_path to the root of your local tldr checkout
repo_path = '/home/lin/Desktop/github/tldr'
platforms = ['osx', 'sunos', 'openbsd', 'android', 'freebsd', 'windows', 'linux', 'netbsd', 'common']
for platform in platforms:
    filter_files(os.path.join(repo_path, 'pages', platform))

/home/lin/Desktop/github/tldr/pages/osx

  • launchd.md

/home/lin/Desktop/github/tldr/pages/openbsd

  • pkg.md

/home/lin/Desktop/github/tldr/pages/windows

  • del.md
  • wget.md
  • rmdir.md
  • gal.md
  • ri.md
  • ni.md
  • rm.md
  • iwr.md
  • pwd.md
  • cd.md
  • curl.md
  • clear.md
  • cls.md
  • mv.md
  • mi.md
  • reg.md
  • sl.md
  • gl.md
  • move.md

/home/lin/Desktop/github/tldr/pages/linux

  • pkgctl.md
  • cgroups.md
  • xbps.md
  • tailf.md
  • nmcli.md
  • eselect.md
  • distrobox.md
  • systemd-resolve.md
  • apx.md
  • unmount.md
  • lid.md

/home/lin/Desktop/github/tldr/pages/common

  • linode-cli.md
  • ruff.md
  • transmission.md
  • ppmnorm.md
  • ppmtowinicon.md
  • pgmnorm.md
  • ppmtojpeg.md
  • open.md
  • bundler.md
  • cups.md
  • pngtopnm.md
  • pamfixtrunc.md
  • lckdo.md
  • pbmtoicon.md
  • pcdindex.md
  • ppmtogif.md
  • clamav.md
  • pnmcomp.md
  • pgmslice.md
  • todoman.md
  • pgmedge.md
  • ykman.md
  • pnminterp.md
  • bmptoppm.md
  • moreutils.md
  • pnmflip.md
  • ppmquant.md
  • pgmcrater.md
  • pnmtotiff.md
  • frp.md
  • just.md
  • ps-nvm.md
  • musl-gcc.md
  • tldr.md
  • pamrgbatopng.md
  • pnmarith.md
  • ripgrep.md
  • ppmtomap.md
  • pgmtopbm.md
  • pgmoil.md
  • pnmsplit.md
  • powershell.md
  • pnmenlarge.md
  • winicontoppm.md
  • pbmtox10bm.md
  • ppmquantall.md
  • cron.md
  • pnmcut.md
  • ppmtotga.md
  • pnmfile.md
  • ppmtouil.md
  • icontopbm.md
  • ppmbrighten.md
  • pnmscale.md
  • gemtopbm.md
  • pnmtofits.md

@blueskyson
Member

I was too careless writing the set-alias-page script then 😰

@Epik-Whale463

Epik-Whale463 commented Sep 27, 2024

I suggest updating the get_alias_page function with a better regex to help accurately identify valid alias pages and manage titles with spaces.

Please let me know if I can work on this with that idea.
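
A rough sketch of what a stricter check might look like, offered only as a starting point. It assumes an English alias page has a `# title` first line, exactly one `> This command is an alias of ...` line, and exactly one example line of the form `` `tldr ...` ``. The name `parse_alias_page` and the three patterns are hypothetical and are not taken from set-alias-page.py; translated pages would need their own alias sentence.

```python
import re

# Assumed (simplified) layout of an English alias page:
#   # title
#   > This command is an alias of `original`.
#   - View documentation for the original command:
#   `tldr original`
TITLE_RE = re.compile(r"^# (?P<title>\S.*?)\s*$")
ALIAS_RE = re.compile(r"^> This command is an alias of `(?P<original>[^`]+)`\.?\s*$")
TLDR_RE = re.compile(r"^`tldr (?P<target>[^`]+)`\s*$")


def parse_alias_page(text):
    """Return (title, original, target) for an alias page, else None."""
    lines = text.splitlines()
    if not lines:
        return None
    # The title may contain spaces; everything after "# " is kept.
    title = TITLE_RE.match(lines[0])
    aliases = [m for m in map(ALIAS_RE.match, lines) if m]
    # Example lines start with a backtick; an alias page should have exactly one.
    examples = [line for line in lines if line.startswith("`")]
    if title is None or len(aliases) != 1 or len(examples) != 1:
        return None
    target = TLDR_RE.match(examples[0])
    if target is None:
        return None
    return title["title"], aliases[0]["original"], target["target"]


# Hypothetical page content, used only to exercise the function.
page = """# foo bar

> This command is an alias of `baz qux`.

- View documentation for the original command:

`tldr baz qux`
"""
print(parse_alias_page(page))  # ('foo bar', 'baz qux', 'baz qux')
```

A page that merely points at another page with `` `tldr ...` `` but lacks the alias sentence returns `None` here, which is the case the filter script above was written to find.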

@sebastiaanspeck
Member

> I suggest updating the get_alias_page function with a better regex to help accurately identify valid alias pages and manage titles with spaces.
>
> Please let me know if I can work on this with that idea.

Feel free to give it a shot! We are always open to ideas and suggestions.

@sbrl
Member

sbrl commented Oct 2, 2024

Yeah, I think updating the script as @Epik-Whale463 suggests is a good plan.
