Discussion about this post

Neural Foundry

The personalized hard negative sampling approach is clever. Using completion rates to identify titles users abandoned is way more informative than random negatives. I'd be curious how they handle edge cases where users drop off for non-content reasons, like interruptions. The MoE architecture makes sense here too, especially with the adaptive gating between short-term and mid-term experts.

Rainbow Roxy

Thanks for writing this; it clarifies a lot. What if S-MoE's adaptive gate misroutes user behavior?
