Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some stm32_fsdev optimizations #2178

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from
Draft

Conversation

Rocky04
Copy link
Contributor

@Rocky04 Rocky04 commented Jul 24, 2023

As requested, the suggested changes of the discussion from #2174

  • Skip interrupt handler if there is nothing to do.
  • Use the SOF as a failsafe to restore from sleep if the host don't trigger the remote wakeup.
  • Drop pending interrupts which were skipped if a reset took place.
  • Only suspend device if there is no pending remote wakeup.

@@ -656,7 +660,14 @@ void dcd_int_handler(uint8_t rhport) {

(void) rhport;

uint32_t int_status = USB->ISTR;
// Only handle interrupts which are triggered and currently active
uint32_t int_status = USB->ISTR & USB->CNTR;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR, please use an intermediate variable, otherwise IAR will emit warning since the order of 2 volatile read is not clear.

Copy link
Contributor Author

@Rocky04 Rocky04 Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh? I don't get why this matters. Sounds ridicules to me to use:

uint32_t int_status = USB->ISTR;
int_status &= USB->CNTR;

I don't get why a compiler should throw a warning here, the order shouldn't be defined by C either and the compiler is free to reorder instructions anyway. The order which register is read first here also don't matter.
To be clear, C defines the operator precedence but as far as I know the compiler is free to decide when the independent parts are handled. Like if A or B is loaded first here C = A + B;

Regardless I changed it...

To be clear, I get why this is may useful for some circumstances, like when the same volatile is used to specify the address multiple times for a single statement (like a[volatile] = b[volatile];) , but if these are fully independent and different addresses anyway I would expect the compiler should be clever enough to notice that it's fine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh? I don't get why this matters. Sounds ridicules to me to use:

That's definitely is not non-sense

I don't get why a compiler should throw a warning here, the order shouldn't be defined by C either and the compiler is free to reorder instructions anyway. The order which register is read first here also don't matter.

The order of register access is important in some cases, register read has side effects such as clear/set bits.

image

Copy link
Contributor Author

@Rocky04 Rocky04 Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be really nice if you can provide an example where this matters (of course under the mentioned assumption that the address isn't the same). Sadly showing that operations can have side effects don't really add any value to this discussion.

Would be possible in the case a register has side effects to another one so that they are dependent to each other, but I'm not aware of such one.

Would be really appreciated to be made aware of such a potential coding flaw.

Edit:
So I fully agree that the compiler should emit a warning here because the same volatile is accessed multiple times in a single statement:
int a = *(volatile b) + *(volatile b) * 2;

But not here when they are referring to different addresses:
int a = *(volatile b) + *(volatile c) * 2;

Copy link
Contributor Author

@Rocky04 Rocky04 Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I guess you are right and it can cause issues for some very rare things. Like timing related things in a communication mesh. For example when different SPI secondaries need to be accessed in a very specific order.

false whitespace
- Removing a suppression about an unused parameter because it's used now
- Pass the port parameter down to called functions in the interrupt handler instead of using a const port
- Consistent code style in the interrupt handler
- Forgot the inversion when manually copying the code over :-(
@Rocky04 Rocky04 changed the title Some optimizations Some stm32_fsdev optimizations Aug 8, 2023
@HiFiPhile
Copy link
Collaborator

I can rebase and merge 3 commits:

  • Use the SOF as a failsafe to restore from sleep if the host don't trigger the remote wakeup.
  • Drop pending interrupts which were skipped if a reset took place.
  • Only suspend device if there is no pending remote wakeup.

For Skip interrupt handler if there is nothing to do. as I said if no irq is active dcd_int_handler() won't enter.

USB->CNTR &= ~USB_CNTR_FSUSP;

/* Disable SOF if it wasn't enabled to begin with */
dcd_sof_enable(rhport, previousSofState);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an issue here, I think.

The call dcd_sof_enable(rhport, true); on line 736 sets previousSofState = true. But we expect to call dcd_sof_enable(rhport, false) here if SOF is used as a failsafe to wake the device?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. At 736 either the value should not overwritten or it should be restored.

@Rocky04 Rocky04 marked this pull request as draft May 13, 2024 21:06
@Rocky04
Copy link
Contributor Author

Rocky04 commented May 13, 2024

Sorry, I need to revaluate this one.

The problems where this seemed helpful was to address an issue where the device stuck in the sleep mode. But the issue lied somewhere else. If a DCD_EVENT_BUS_RESET event occurs while on sleep (after tud_suspend_cb() was called) the internal device state gets wiped but the application don't get informed to be awake nor to be in the unconfigured state again. Not a single callback is triggered. And if the device recovers just depends on how their power state handling is written.

Edit:
I still need to push a PR for that fix.

@HiFiPhile
Copy link
Collaborator

If a DCD_EVENT_BUS_RESET event occurs while on sleep (after tud_suspend_cb() was called) the internal device state gets wiped but the application don't get informed to be awake nor to be in the unconfigured state again. Not a single callback is triggered.

Do you have a test case ? I'll take a look.

@Rocky04

This comment was marked as off-topic.

@Rocky04

This comment was marked as off-topic.

@HiFiPhile
Copy link
Collaborator

HiFiPhile commented May 14, 2024

Which device your are testing ?

I've tested G0B1 and L053, both of them resume correctly when reset happens while suspended with XHSETT.

USBD Suspend : Remote Wakeup = 0
USBD Resume
USBD Bus Reset : Full Speed

image

@Rocky04

This comment was marked as off-topic.

@HiFiPhile
Copy link
Collaborator

As mentioned please don't focus on the resume here, better check if tud_umount_cb() is called for a USB bus reset. You should also see that on the shown LED blink rate.

Yes we could do better for tud_umount_cb().

I mainly work with GD32 chips, especially GD32F303 and GD32F425.

That's what I asked for test case, instead of spamming half page.

These chips are not supported, if you need them please make a proper port and make these changes GD32 specific.

@Rocky04

This comment was marked as off-topic.

@HiFiPhile
Copy link
Collaborator

HiFiPhile commented May 15, 2024

GD32 chips are already ported.

The only GD32 supported is GD32VF103:

#define OPT_MCU_GD32VF103 1600 ///< GigaDevice GD32VF103

Compile to other target with a define OPT_MCU_STM32L0 is a hack.

@Rocky04
Copy link
Contributor Author

Rocky04 commented May 15, 2024

I didn't wrote that I'm using a port which is provided by this repro. 😉
I also don't get why this matters. This hasn't anything to do anymore with this PR.

Again, I need to revaluate this one (which will likely take a while) and I will create another one to address the suppressed state change callbacks (which I hope I can push this week).

@HiFiPhile
Copy link
Collaborator

I didn't wrote that I'm using a port which is provided by this repro. 😉
I also don't get why this matters. This hasn't anything to do anymore with this PR.

I don't understand why didn't you put the fix in your port since it's a GD32 specific issue, and this driver is for STM32.

@HiFiPhile HiFiPhile closed this May 15, 2024
@Rocky04
Copy link
Contributor Author

Rocky04 commented May 15, 2024

What is your problem??? I wrote that I mainly work on GD32 chips. This doesn't mean that I don't have a board which uses this driver! For example I have a STM32F3DISCOVERY as a reference and back in the day I also encountered issues with that. But as I wrote I need to revaluate the need of this PR because of the other fixes I added. As mentioned I don't mainly work with this chip which means I need to test it again which takes time.

@wronex
Copy link

wronex commented May 15, 2024

You seem like two very knowledgeable gentlemen. Perhaps you can answer my very simple questions and bring this conversation around in a more cordial direction.

I was interested in this pull-request because my STM32F042F6PX device is getting stuck in suspend. It only happens a few times per week and I haven't been able to capture the reason. This pull-request seemed like a good way try something new. And it almost worked. Or perhaps I'm imagining. It has only happened once since. The device still gets stuck though. And Windows thinks it awake while the device thinks it should sleep.

How is suspend communicated to the device? Is it a special packet? Is it a special signal voltage combination?

@HiFiPhile
Copy link
Collaborator

For example I have a STM32F3DISCOVERY as a reference and back in the day I also encountered issues with that.

It would be nice if you state STM32F3 is affected instead of direct to a unsupported chip plus a port.

@wronex Please provide more information, there could other reasons that device stops responding.

How is suspend communicated to the device? Is it a special packet? Is it a special signal voltage combination?

Devices can go into the Suspend state from any powered state.
They begin the transition to the Suspend state after they see a constant Idle state on their upstream facing bus
lines for more than 3.0 ms.

Host sends SOF packets periodically with a interval of 1ms for FS and 0.125ms for HS, device go to suspend state when it stops.

@Rocky04
Copy link
Contributor Author

Rocky04 commented May 16, 2024

@HiFiPhile
I don't know how many times is need to mention again that I need to revaluate it before I can say if it's affected or not. Without doing some (cross) checks I can't tell because at this point I don't know that anymore. I set this PR back as a draft and didn't close it for a reason. I had various issues with STM32F3 in the past. In the end a lot of my problems where related how usbd handles the events. But I also had issues where I think that these were related to the dcd_stm32_fsdev driver. There are so many USB hubs out there which behave completely different. It's maybe fine for you that there are compatibility issues with some devices but I would like to address these.

So cut some slack and just give me some time. It's almost a year since I initially created this PR. It seems crazy to me that you first seemed to blindly accept this PR while now making a lot of fuss out of nowhere and after I set it back as a draft.

@wronex
Your setup would be interesting? Like how is the device connected to the host, in the case a hub is used which is it and if it's used with an external power supply or not? When did it happen and what happened before? Like have you put the host in the hybernation or into energy saving mode? Did the device tried to wake the host?

I'm also interested why you know that the host thinks the device is active while the device thinks it's still suspended. I assume the USB core itself isn't in the low power mode and that the underlying TinyUSB device task still handles requests. Otherwise the host would not be able to communicate with the device and mark the device as unresponsive.

A device should enter the suspend state when there is no activity on the bus for more than 3 ms. Meaning no signal change (only D+ high for Full-Speed, only D- high for Low-Speed). Normally there is at least a start of frame (SOF) packet each millisecond causing a signal change. While the device is suspended, as soon as there is a signal change this driver should notice that and schedules a resume event which wake up the device. (This isn't great in my option because it can lead to problems.) If allowed (the device needs to offer the support in the device descriptor and the host needs to enable it via a request) the device can also wake up the host, for that it's inverting the signal state (so only D- high for Full-Speed or D+ high for Low-Speed). There are two levels, the driver itself handles the USB core power state depending if it's suspended or not and on a higher level the app gets informed by TinyUSB callbacks about the current state to be able to control other device functions accordingly.

@wronex
Copy link

wronex commented May 17, 2024

Thank you both for the information. I will attempt to capture more details about what happens.

My setup is a STM32F042F6P6TR that acts as a USB<->UART bridge using HID and WinUSB. USB only powers the microprocessor and a LED, the rest of the system is optically isolated. The device is connected with a USB-C cable to a no-name 10-port USB-hub with external power. Being a 10-port HUB it has lots of other peripherals that are connected and working as expected.

I can force the device into suspend by disconnecting the hub from the PC. The device successfully enters suspend, and successfully resumes when the hub is reconnected. It can also be forced into suspend by sleeping the PC. This also works as expected.

I've been relying on USB Device Tree to check the power state. This is perhaps not good enough?

Problems arise when the device randomly enters suspend while the PC is awake. USB Device Tree reports power state D0 (awake.) But any attempt to access the device with WinUSB results in "Device has malfunctioned".

I don't know if this is an issue with the hub (as USB hubs tend to by unreliable.) But all other devices connected to the same hub works as expected. If this happens on my setup it is bound to happen for others as well.

It would be acceptable to detect this situation and reboot the microprocessor.

Thank you for your help!

@HiFiPhile
Copy link
Collaborator

HiFiPhile commented May 17, 2024

From stm32f0 errata sheet I found something suspicious:

ESOF interrupt timing desynchronized after resume signaling
Description
Upon signaling resume, the device is expected to allow full 3 ms of time to the host or hub for sending the
initial SOF (start of frame) packet, without triggering SUSP interrupt. However, the device only allows two full
milliseconds and unduly triggers SUSP interrupt if it receives the initial packet within the third millisecond.
Workaround
When the device initiates resume (remote wakeup), mask the SUSP interrupt by setting the SUSPM bit for 3 ms,
then unmask it by clearing SUSPM.

I've setup a test based on hid_generic_inout example with stm32l053 which read/write device every 10s and let OS put it to suspend state, so far after few hours on issue observed.


@Rocky04
I apologises for closing this draft PR after some misunderstood, now I understand you work on a ST device and needs some time to investigate.

At the same time please mind the way of your communication as within few exchanges you've already have glitches with all mainteners involved:

From your interrogation style self introduction:
#2174

Then massive spamming:
#2180
#2220

Also language like I'm still irritated why you think a vendor driver isn't controlled by usbd.

We are contributing freely with our free time and not your employee.

@HiFiPhile HiFiPhile reopened this May 17, 2024
@Rocky04
Copy link
Contributor Author

Rocky04 commented May 17, 2024

Really??? Still blaming and accusations... What is this place? I thought contribution would be welcome. I also do this here in my free time and bought the devices myself where I heard user have issues with.

I would very appreciate if you would just tell me the next time when you have trouble understanding me and in return I will ask you to rephrase the question in a more detail manner instead of "spamming" in order to try to answer it in every possible way, when that wasn't clear to me.

The note you mentioned is interesting. Before I encountered an issue which sounds like that. I had the problem that when the device enabled the wake-up signal that a suspend interrupt can occur while the signal is still hold. Due to that the USB core entered the low power state and the ESOF interrupt didn't triggered anymore. Because of that the device were stuck in a dead lock messing up the signaling. Sadly I wasn't able to replicate that issue yesterday on a quick check. As far as I remember I was able to replicate that problem before with a GD32 as well as a STM32.

@hathach
Copy link
Owner

hathach commented May 20, 2024

hi hi, thank you everyone for contributing, we are all busy with other fulltime works and is working on this on our own free time without paid. The main reason for most of us is having fun and helping each other.

There is lots of stm32 clone: gd32, ch32 to name a few, they mixed cpu core and usb core and they also use their own usb core in additionn as well. It is nonrmal to assume they behave differently. Also, not everyone speak Enlighs natively, me as an example, I don't use English often in my normal life, and I often skip lots of text while reading and misunderstand/misread others quite often.

All in all, there is no point to exaggerate or get ourself into more arguments than we are already in our normal life. Let keep this a fun thing to do in our spare time.

PS: I am way behind any schedules I got, and have tons of pending PRs (for months and years), therefore I often skipped checking draft PR like this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants