Pi4 remaining issues #253
Update:
Issue 1: The interrupt register for the SMI interrupt (fake vsync) is now IRQ0_PENDING1 on the Pi 4, at offset 0x00B204. After changing that, the vsync red bar now works (genlocking is still not working).
Issue 5: Mode 7 is now OK. It was using the cached screen option, but the memory area wasn't actually set as cached, which caused the glitches.
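For anyone following along, the Issue 1 register change can be sketched roughly as below. This is only an illustration: the offset is the one quoted above, but the helper name, the way the peripheral base is passed in, and the bit number are assumptions (this thread does not say which bit of IRQ0_PENDING1 carries the SMI interrupt).

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset quoted in the comment above (Pi 4 only). */
#define IRQ0_PENDING1_OFFSET 0x00B204u

/* Hypothetical helper: returns true if `bit` (0..31) is set in the 32-bit
 * register at periph_base + reg_offset. The caller supplies the peripheral
 * base and the bit number; neither is stated in this thread. */
static bool irq_bit_pending(const volatile void *periph_base,
                            uint32_t reg_offset, unsigned bit)
{
    const volatile uint8_t *base = (const volatile uint8_t *)periph_base;
    const volatile uint32_t *reg =
        (const volatile uint32_t *)(const volatile void *)(base + reg_offset);
    return ((*reg >> bit) & 1u) != 0u;
}
```

Passing the base as a pointer keeps the helper testable against an ordinary array on a host machine before it is pointed at real hardware.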
Issue 2: Screencaps are now working after changing the compiler options to target Arm v6 (like the universal Pi 0-3 build) instead of Arm v8. So the compiler output for Arm v8 has some issues; perhaps something isn't set up correctly, like unaligned access.
Update:
Issue 3: 16bpp display modes now work properly. The display list has moved from offset 0x402000 to 0x404000.
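A minimal sketch of selecting the display list offset per model, assuming a model enum that is invented here for illustration (the two offsets are the ones given above; the names are hypothetical and not taken from the project source):

```c
#include <stdint.h>

/* Hypothetical model selector, for illustration only. */
enum pi_model { PI_MODEL_0_TO_3, PI_MODEL_4 };

/* Offsets quoted in the comment above. */
#define DISPLAY_LIST_OFFSET_PI0_3 0x402000u
#define DISPLAY_LIST_OFFSET_PI4   0x404000u

/* Returns the display list offset (relative to the peripheral base)
 * for the given model. */
static uint32_t display_list_offset(enum pi_model model)
{
    return (model == PI_MODEL_4) ? DISPLAY_LIST_OFFSET_PI4
                                 : DISPLAY_LIST_OFFSET_PI0_3;
}
```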
Nice work, Ian... Have you found some secret source of information, or are you just figuring this out by trying lots of different things?
Just the Linux source. I looked through it for references to bcm2711 or vc5, or just an additional '5', in all the vc4-related files, and tried that offset. It didn't work, but I dumped the display list to the serial port and it was immediately obvious that the screen start address had moved by one word compared to a Pi Zero. I changed the code for that and got a working image, but with the colours reversed, so I then changed the PIXEL_ORDER value in our defs.h and got the correct image.
The only remaining issue is the HDMI clock (or clocks, as there are two outputs).
BTW, I think it would be useful to compile a wiki page with all the bare metal resources like the above source files, datasheets, the SMI tutorial, the display list tutorial, Raspberry Pi forum posts etc., which we could add to as we find them.
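The display list dump mentioned above could look something like the following sketch. It formats the words into a buffer so that the result can be sent over whatever serial routine is available and compared side by side between a Pi Zero and a Pi 4. The function name and output layout are assumptions; the project's actual logging code is not shown in this thread.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes one "index: value" line per 32-bit word into
 * `out`. Returns the number of characters written (excluding the NUL), or
 * -1 if `out` is too small to hold the whole dump. */
static int format_display_list(const uint32_t *words, size_t count,
                               char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size > 0) {
        out[0] = '\0';
    }
    for (size_t i = 0; i < count; i++) {
        int n = snprintf(out + used, out_size - used,
                         "%02zu: %08" PRIx32 "\n", i, words[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;   /* output would have been truncated */
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

With one word per line, a field that has shifted by a single word shows up immediately when two dumps are diffed.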
This is the output of:
This confirms what I had already worked out by observation: unlike the previous models, PLLA controls the core and the aux UART, so it is not suitable for use as the CPLD sampling clock.
Screencaps are still randomly hanging or throwing error 55 on the Pi 4 when calling lodepng. It seems to be build or memory-content sensitive, as adding logging or making any unrelated change can alter the behaviour. I've disabled unaligned access in the compiler, but it hasn't helped with this issue. I will leave it disabled anyway, because there can be different behaviour between Arm v6, v7 and v8 which might affect a universal binary. I'm building the Pi 4 version with the compiler targeted at Armv8 with Cortex-A72 optimisation, so it should be building with the correct instructions. I also tried updating lodepng to the latest version.
Possible location of registers affecting pixel clock at base_address + 0xf00f00 (base of hdmi0 phy according to the above links)
I now have Pi 4 genlocking working on 1080p50 by altering the value in 0xfef00f98 (HDMI_RM_OFFSET), but I still need to work out how the value in that register relates back to the pixel clock value (at the moment the code assumes 148.5 MHz).
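As a quick sanity check on the addresses, the two figures quoted in this thread are consistent with each other if the peripheral base is 0xFE000000. That base is an assumption here (the usual ARM-side base on the Pi 4 in low-peripheral mode); the macro and function names are made up for illustration.

```c
#include <stdint.h>

#define PI4_PERIPHERAL_BASE  0xFE000000u  /* assumption: not stated in the thread */
#define HDMI0_PHY_OFFSET     0x00F00F00u  /* "base_address + 0xf00f00" above */
#define HDMI_RM_OFFSET_ADDR  0xFEF00F98u  /* address quoted above */

/* Distance of HDMI_RM_OFFSET from the hdmi0 phy base, derived purely from
 * the two numbers quoted in the thread. */
static uint32_t hdmi_rm_offset_from_phy(void)
{
    return HDMI_RM_OFFSET_ADDR - (PI4_PERIPHERAL_BASE + HDMI0_PHY_OFFSET);
}
```

Under that assumption the register sits 0x98 bytes above the hdmi0 phy base. How its value maps to the pixel clock is still the open question noted above, so nothing about that is encoded here.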
Pi 4 genlocking is now fully working.
Remaining issues:
lodepng crashing.
I also noticed a problem during startup where the SD card doesn't work properly after a power cycle. It does work after the second power cycle, but the error delays bootup by a few seconds.
I tried adding extra delays around the power cycle, but that didn't seem to help.
The reason why lodepng is crashing is that malloc() is returning garbage addresses:
Might be a bug in the compiler, or a buffer overrun which is corrupting the memory allocation tables, that only happens on the Pi 4?
That's very interesting... Honestly, I have no idea how malloc is meant to work in a bare-metal environment, what region of memory it is using, and how that is controlled. This issue likely also affects PiTubeDirect. I'll do some digging tomorrow (if I'm not feeling too grotty after being boosted today). If you find the answer in the meantime, do update this thread.
Just confirming here what I think you already knew (but I didn't). It seems the default implementation of malloc in GCC looks for a symbol called _end that usually follows the bss region and starts the heap there. If I build on the Pi 4, then do a nm I see
So it looks like the heap should start at that point. On the Pi 0-3 build the address is similar: 027d0038
So the Pi 0-3 addresses you see look like they are within the heap, whereas the Pi 4 addresses look wrong. I'm not sure how best to track this down. GCC includes heap consistency checking tools, but I don't expect these are available in an ARM Embedded environment.
The most likely cause of this is something unrelated to lodepng overflowing an array (or similar) and corrupting the memory allocator's view of the heap.
Dominic's been using a static analysis tool called cppcheck to improve the code in PiTubeDirect. It might well spot possible causes. I've never used this myself, so I can't really advise you on how to run it.
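One cheap way to catch this at run time is a range check on every pointer malloc() hands back, sketched below. The helper name and the heap limit are hypothetical; on the target the start would come from the linker symbol, for example `extern char _end;` and `(uintptr_t)&_end`, and the limit from whatever the memory map allows.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if `ptr` lies in [heap_start, heap_limit).
 * heap_start is expected to be the address of `_end`; heap_limit is an
 * assumption the caller has to supply. */
static bool ptr_in_heap(uintptr_t ptr, uintptr_t heap_start,
                        uintptr_t heap_limit)
{
    return ptr >= heap_start && ptr < heap_limit;
}
```

Logging the first allocation that fails this check would at least narrow down which earlier call left the allocator in a bad state.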
Something like this might help:
I found out where it is happening but it's not fixed yet: The first time the SD card is accessed is in cpld_init() when the "/Delete_This_File_To_Erase_CPLD.txt" is checked
When it works you get:
When it doesn't you get:
If you move the second png encode to between the check_file() and test_file() calls, it works. So it looks like the error on the check_file() call leaves things in an indeterminate state, and the subsequent call to test_file() actually trashes the allocator info.
BTW, I've sent a pull request with the latest version that will build on v10.3-2021.10 of the compiler.
If you want to look at this, note that EMMC_DEBUG is disabled by default even in the debug build, as it made startup times very long (see line 52 of src/fatfs/sd_card.c).
I've enabled EMMC_DEBUG and there does seem to be an intermittent failure of a specific command during the SD Card initialization sequence. A good startup looks like:
A bad startup looks like:
At this point the EMMC driver just gives up and bails. The logging around the failed command is:
This is the first command in the initialization sequence which uses data transfer, to transfer an 8-byte block (which includes the SD Card version and data bus width). I've tried a number of things:
None of those things worked... I suspect this is a bug in the EMMC emulation in the Pi 4 which ends up with some data transfer state machine being stuck somewhere.
I did spot the malloc bug, which is in sd_card_init(). The emmc_block_dev structure can either be malloced, or passed in
If sd_card_init bails, that memory is always freed:
So you end up freeing memory which was statically allocated, which is a big no-no and is likely to be what's causing the heap corruption. A fix for this is in commit: cab728b
The only workaround I could think of for the SD Card initialization issue was to retry the whole initialization sequence. That's in a separate commit: d879a81
What's interesting is the bug only seems to happen once!
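For readers who haven't looked at the commits, the general shape of both ideas can be sketched as follows. This is not the code from cab728b or d879a81: every name here (`block_dev`, `probe_fn`, `dev_init`, `dev_init_retry`, `flaky_probe`) is invented for illustration. The point is only that the init routine remembers whether it allocated the structure itself, and that a failed init can be retried as a whole.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct block_dev { int ready; };   /* hypothetical stand-in structure */

/* Hypothetical probe callback: returns 0 on success, nonzero on failure. */
typedef int (*probe_fn)(struct block_dev *dev, void *ctx);

/* If *dev is NULL the structure is allocated here and owned by this
 * function; otherwise the caller's storage is used and never freed here. */
static int dev_init(struct block_dev **dev, probe_fn probe, void *ctx)
{
    bool owned = false;
    struct block_dev *d = *dev;

    if (d == NULL) {
        d = calloc(1, sizeof *d);
        if (d == NULL) {
            return -1;
        }
        owned = true;
    }
    if (probe(d, ctx) != 0) {
        if (owned) {
            free(d);            /* only release what was allocated here */
        }
        return -1;
    }
    d->ready = 1;
    *dev = d;
    return 0;
}

/* Retries the whole initialization up to max_attempts times. */
static int dev_init_retry(struct block_dev **dev, probe_fn probe, void *ctx,
                          int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (dev_init(dev, probe, ctx) == 0) {
            return 0;
        }
    }
    return -1;
}

/* Example probe for illustration: fails while *ctx (an int) is above zero. */
static int flaky_probe(struct block_dev *dev, void *ctx)
{
    int *remaining_failures = ctx;
    (void)dev;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return -1;
    }
    return 0;
}
```

With a probe that fails once and then succeeds, which matches the "only seems to happen once" observation, a second attempt is enough, and a statically allocated structure passed in by the caller is never handed to free().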
There are a couple of improvements that could be made from here:
Thanks for that, it looks like Pi 4 support is now complete!
One strange thing I noticed with the Pi 4:
Pi 4 now supports the enhanced CGA artifact emulation.
I thought I'd document the remaining issues with Pi 4 support to help in tracking down the required info:
Genlocking doesn't work.
This is due to the HDMI PLL registers moving or changing
Also the interrupt controller has changed so the Vsync flag can't be read (used for genlock and displaying the red sync bar)
Screencaps don't work
lodepng generates error 55
16bpp display modes don't work properly because the mailbox will only select 5:6:5 RGB and the display list gets modified to switch this to 4:4:4:4 ARGB. The display list and associated registers have moved or changed so that trick no longer works.
This could be worked around by reverting to the original 5:6:5 capture code. The reason for the change was to speed up the capture loops as almost no logical masking and shifting was required for 4:4:4:4 ARGB compared to 5:6:5 RGB.
However, due to the new GPU capture code, the loss of speed by reverting may not be an issue, especially if it is only done for the Pi 4.
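To illustrate why 4:4:4:4 ARGB needs less masking and shifting than 5:6:5 RGB, here is a rough sketch that packs 4-bit channels both ways. The 4-bit input width and the function names are assumptions made for the example; the actual capture loops are not shown in this issue.

```c
#include <stdint.h>

/* 4:4:4:4 ARGB: each 4-bit channel drops straight into its own nibble. */
static uint16_t pack_argb4444(uint8_t a4, uint8_t r4, uint8_t g4, uint8_t b4)
{
    return (uint16_t)(((a4 & 0xFu) << 12) | ((r4 & 0xFu) << 8) |
                      ((g4 & 0xFu) << 4)  |  (b4 & 0xFu));
}

/* 5:6:5 RGB: each 4-bit channel has to be widened first (here by
 * replicating its top bits) before being shifted into place. */
static uint16_t pack_rgb565_from4(uint8_t r4, uint8_t g4, uint8_t b4)
{
    unsigned r5 = ((r4 & 0xFu) << 1) | ((r4 & 0xFu) >> 3);
    unsigned g6 = ((g4 & 0xFu) << 2) | ((g4 & 0xFu) >> 2);
    unsigned b5 = ((b4 & 0xFu) << 1) | ((b4 & 0xFu) >> 3);
    return (uint16_t)((r5 << 11) | (g6 << 5) | b5);
}
```

The 5:6:5 path does roughly twice the bit operations per pixel, which is the cost the original change was avoiding.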
The PLLs need further experimentation.
I've managed to get it working using PLLD, but all the PLLs seem to be in use, so this needs to be looked at further, as it currently runs the PLL higher than its default rather than lower (running lower causes the Pi to lose display or lock up).
Mode7 is glitchy
Mode 7 is glitchy even with GPU capture. However this can be fixed by reducing the capture width by a few pixels (it's already wider than the active screen so nothing is lost).
This is very likely due to higher memory latency than on the other Pi versions.