hverkuil: Kiitos!
kbingham: hi, I've just read your email about "Fault tolerant V4L2"
kbingham: IIUC, you may be interested in this patch series https://lkml.org/lkml/2018/9/4/376
javier__, Ohh - having read the subject line - yes that does sound interesting :)
kbingham: because AFAIK what you want is to get rid of the rvin_parallel_notify_complete() callback
since that will only be called if there's no pending subdevices to be probed async
neg, pinchartl, jmondi would be interested too :)
javier__: thanks for pointing out this series, I missed it. I replied to it.
hverkuil: thanks for your feedback. It's funny because I actually agree with you and my first proposal was https://patchwork.kernel.org/patch/10584087/
but was nacked by sailus. So if we will expose the media graph before it's properly initialized, then at least it has to be more robust
javier__: In the past what we looked at for rcar-vin is to register the vdev at probe time and not from the async complete callback. There have been some different views when this have been discussed in the past so maybe it's to to resume that discussion :-)
I see no point in only doing partial initialization unless we have a way of communicating to userspace what has failed. It's great to be fault tolerant, but userspace needs to know how faulty it really is and whether it is safe to continue or not.
hverkuil: Agreed. btw, this use case is for a 2-in-1 tablet/laptop that has a front and back cameras, so I understand sailus point too
javier__: thanks for the pointer!
If we implement this, then only after the userspace aspects are addressed. I'll nack it otherwise.
hverkuil: my point is that we should either only register the media device if everything is initialized (i.e: my first patch) or expose as much as possible if something fails (i.e: my second patch-series)
Right now it is easy: if it comes up, then everything is working.
hverkuil: don't yout think userspace knows if it can continue or not by just inspecting the media graph? If, quoting your example in the mail reply, 9 cameras are fine and 8 are not, won't userspace know by just inspecting the media graph?
hverkuil: great, could you please comment on https://patchwork.kernel.org/patch/10584087/ then?
since right now I have the worst of both worlds, a media graph exposed with links not created for the registered subdevices
(if one of them fails)
commented on the patch. And I agree, it's broken (out-of-spec) right now.
hverkuil: great, thanks!
jmondi: how would applications know something is missing? Then they would have to have an internal copy of the topology as it should be in their own code.
kbingham, jmondi: btw, rcar-vin has the same issue. It registers the media device too early before the subdevices are registered async
instead should be done at the end of the rvin_parallel_notify_complete() callback
BTW, I understand that it would be easy in a simple case like a front and back camera, but this is really a simple example of a much bigger problem. I'd like to see this fixed properly and not by hacking something into the code.
Old discussion: https://patchwork.kernel.org/patch/9849317/
hverkuil: I might be missing a piece of the picture, but if an application knows that it needs subdev-X and subdev-Y, but could work without subdev-Z, it does not need to make sure "all HW components have successfully probed" (and decide that inspecting a copy of the expected media-graph). I'll read the old discussion again before commenting further though
jmondi: perhaps. But someone needs to make a proper proposal first, thinking through all the various use-cases that we know of (or can think of).
hverkuil: thing is that it seems to me that this kind of decisions ("can I continue or not with just X subdevs registered") is a userspace decision, and trying to push it down to the kernel we'll never find any solution that works for everyone :)
I'm sure there will be occasion to discuss that further in a month or so
There are two aspects: right now apps expect that devices only appear if the HW is fully initialized and working. That's what you want most of the time anyway.
So allowing for partial initialization means that this should be somehow enabled explicitly (see old discussion). That's one problem.
The other is that once this is enables, how do you tell userspace what is and what is not initialized. Should we use time outs? What if something breaks after it has been initialized? In other words, how is this supposed to work as seen from userspace?
I want to see an RFC for this.
hverkuil: I've thought that we could add some MEDIA_EVENTS (like we have for V4L2) notifying users-pace if it was complete, how many subdevices were expecting, etc
javier__: Yes, but please wait a bit.
That's WiP now.
javier__: quite possibly. Again, take a day to think about this and make a proposal. I have a gut feeling that this doesn't have to be difficult implementation wise, but we need a proper plan.
hverkuil: I think sailus have a plan for this
*has
hverkuil: for now I'm OK with no exposing the media graph until all the subdevices have been fully registered
there's a ton of work to support the complex cameras before thinking about the case of having a partial working media graph
And I'm back ...
javier__, Will you be at ELCE (/#media summit day?)
kbingham: unfortaunately I won't, but Wim Taymans (GStreamer / PipeWire) will be there if you want to discuss complex cameras support with him
javier__, Ok no worries. Are you going to any other confs?  LPC ?
*unfortunately, I can't type today
kbingham: not as well, I won't attend more confs this year
pinchartl: sailus: neg: Anyone around for a quick question on v4l2/mux series?
jmondi: Go ahead.
sailus: Thanks Sakari, I'm trying to understand the graph walking part, specifically the media_graph_walk_iter() functions
and I was wondering if links get pushed to stack and inspected twice, in order to properly comment on "[PATCH 10/30] media: entity: Use routing information during graph traversal"
That's not supposed to happen.
Can it?
I'm trying to print out the stack push/pop operations, but there are a lot of them on rcar-vin
sailus: that's what I was trying to find out :)
I think it's been at least one and half years since I wrote the code. Probably two.
I don't remember that much.
it's probably the check that assigns "local" and "remote" in that patch that confuses me
Except I think that I probably thought about that at the time.
sailus: eheh, I understand, I'll keep trying to printout the stack though
The local vs. remote is there since as a link is evaluated, there's a need to find which of the pads is the remote one, in order to follow the link in the case routing to the local pad is enabled.
That's my recollection without reading the code throroughly.
What I do know is that the graph traversal code cannot properly handle loops.
But that's been always the case.
sailus: yes, I was looking at the order remote and local are passed to the has_route operation, and if that 'local' vs 'remote' check is actually required, or if the local pad is not the sink, we can skip the check completely. Then we'll meet again the link later as the source pad will be put on the stack anyway... That's why I wanted to know if it is possible to have a link on the stack twice...
but I'm probably wrong :)
The graph traversal needs to follow enabled links whether they're sinks or sources.
I would like some more information about v4l2_file_operations (kernel space). How are the frame transmitted to userland? Which functions should be implemented and what should they do?
benny-innofaith: https://www.linuxtv.org/downloads/v4l-dvb-apis-new/uapi/v4l/v4l2.html
To implement buffer handling new drivers should use the vb2 framework. include/media/videobuf2-core.h
hverkuil: Hi Hans! Do we have any hope of getting the jpeg encoqder into v4.20?
ezequielg: I haven't had time for patch processing, but I plan to do another round tomorrow/Monday.