Jaward Sesay

Jaward

AI & ML interests

I like to train large deep neural nets too 🧠🤖💥 | First Paper (AutoAgents: A Framework for Automatic Agent Generation) Accepted @ IJCAI 2024 | Role Model Karpathy

Organizations

MLX Community

Jaward's activity

posted an update 7 days ago
replied to their post 14 days ago

By the way, the background songs in the videos are actually what I listen to during implementation.

posted an update 14 days ago
posted an update 25 days ago
In Honour of This Year's NeurIPS Test of Time Paper Awardees
This year's NeurIPS Test of Time Paper Awards went to two groundbreaking papers:
1. Generative Adversarial Nets (Goodfellow et al.)
2. Sequence to Sequence Learning with Neural Networks (Sutskever et al.)
Let's explore how these papers helped pioneer breakthroughs in today's AI:

Full Article: https://huggingface.co/blog/Jaward/nip
published an article 25 days ago
In Honour of This Year's NeurIPS Test of Time Paper Awardees

By Jaward
posted an update 26 days ago
Lightweight implementation of the seminal paper “Sequence to Sequence Learning with Neural Networks”

Built, trained, and evaluated a 2-layer seq2seq LSTM-based model (~10M params) on the German-English corpus of the Multi30K dataset, in honor of Ilya Sutskever et al. for winning this year’s NeurIPS Test of Time Paper Award 🫡

Code: https://github.com/Jaykef/ai-algorithms/blob/main/seq2seq.ipynb
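
To get a feel for where a parameter budget of this size comes from, here is a rough counting sketch for a 2-layer LSTM encoder-decoder. The vocabulary sizes, embedding width, and hidden width are placeholder assumptions, not the notebook's actual configuration, and the per-layer formula assumes the PyTorch `nn.LSTM` convention of two bias vectors per layer.

```python
def lstm_layer_params(input_size: int, hidden_size: int) -> int:
    """Weights and biases of one LSTM layer: 4 gates, two bias vectors each."""
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size)


def seq2seq_params(src_vocab: int, tgt_vocab: int, emb_dim: int,
                   hidden: int, layers: int = 2) -> int:
    """Rough parameter count for an LSTM encoder-decoder (hypothetical layout)."""
    # One embedding table per language.
    embeddings = src_vocab * emb_dim + tgt_vocab * emb_dim
    # First layer reads embeddings, deeper layers read the previous hidden state.
    stack = lstm_layer_params(emb_dim, hidden) + (layers - 1) * lstm_layer_params(hidden, hidden)
    # Linear projection from the decoder hidden state to target-vocabulary logits.
    output_proj = hidden * tgt_vocab + tgt_vocab
    # Encoder and decoder each own one LSTM stack.
    return embeddings + 2 * stack + output_proj


# Placeholder sizes chosen only for illustration.
total = seq2seq_params(src_vocab=8000, tgt_vocab=6000, emb_dim=256, hidden=512)
print(f"{total / 1e6:.1f}M parameters")
```

With these placeholder sizes, the embedding tables and the output projection account for nearly half of the total, so vocabulary size moves the count a lot.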
posted an update about 1 month ago
Rethinking Backpropagation: Thoughts on What's Wrong with Backpropagation

As a young researcher, I've often pondered the limitations of backpropagation, especially when compared with how learning occurs in the human brain. While backpropagation has been the workhorse of deep learning, it isn't without flaws. In this post, I aim to share some thoughts on these shortcomings from first principles.

Full Article: https://huggingface.co/blog/Jaward/rethinking-backpropagation
posted an update about 1 month ago
Implemented the compute-efficient DeepPCR algorithm, which parallelizes typically sequential operations to speed up inference and training of neural networks. DeepPCR can significantly reduce the time complexity of operations such as denoising in latent diffusion space from O(L) to O(log2 L), where L is the number of sequential steps.

Code: https://github.com/Jaykef/ai-algorithms/blob/main/deep_pcr.ipynb
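
The jump from O(L) to O(log2 L) is easier to see on a toy problem. The sketch below is not DeepPCR and does not use the Parallel Cyclic Reduction solver it is named after; it is a simplified stand-in that resolves a linear recurrence x_i = a_i * x_{i-1} + b_i by recursive doubling, where every update inside a round is independent and could run in parallel. The function names are made up for illustration.

```python
def sequential_recurrence(a, b, x0):
    """Reference: L dependent steps, one after another."""
    xs, x = [], x0
    for ai, bi in zip(a, b):
        x = ai * x + bi
        xs.append(x)
    return xs


def doubling_recurrence(a, b, x0):
    """Same result in ceil(log2 L) rounds of independent updates."""
    # Entry i starts as the affine map x -> A[i] * x + B[i] taking x_{i-1} to x_i.
    A, B = list(a), list(b)
    n, stride, rounds = len(a), 1, 0
    while stride < n:
        new_A, new_B = A[:], B[:]
        # Each i only reads the previous round's values, so this loop is parallelizable.
        for i in range(stride, n):
            new_A[i] = A[i] * A[i - stride]
            new_B[i] = A[i] * B[i - stride] + B[i]
        A, B = new_A, new_B
        stride *= 2
        rounds += 1
    # Every entry now maps x0 directly to x_i.
    return [Ai * x0 + Bi for Ai, Bi in zip(A, B)], rounds


a = [2, 1, 3, 1, 2, 1, 1, 2]
b = [1, 0, 2, 1, 0, 3, 1, 1]
xs_parallel, rounds = doubling_recurrence(a, b, x0=1)
assert xs_parallel == sequential_recurrence(a, b, x0=1)
print(rounds)  # number of parallel rounds for L = 8
```

The total arithmetic goes up compared with the sequential loop; what shrinks is the length of the dependency chain, which is what matters when the per-round work can be spread across a GPU.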
posted an update about 1 month ago
posted an update about 2 months ago
liked a Space about 2 months ago
posted an update about 2 months ago
Interesting Work on Reasoning 🤔
- Explores a new take on few-shot reasoning while challenging the assumption that program synthesis is necessary for abstract reasoning.
- Shows that test-time training plus smart inference tricks can match average human performance, though at high computational cost. Key insight: proper compute allocation matters more than the method (whether symbolic or neural).

Paper: https://ekinakyurek.github.io/papers/ttt.pdf
posted an update 2 months ago
It's work like this that in some way signals the eventual “dominance” of AI over all the sciences.

“We train our model on the six-dimensional N-body phase space, predicting particle velocities as the time derivative of the model’s displacement outputs”

The emulator is capable of predicting the nonlinear displacement and velocity fields for 128^3 particles in half a second on a single GPU🤯
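
The quoted idea, velocity as the time derivative of the displacement output, can be sketched with a toy function. Everything here is an assumption for illustration: `toy_displacement` is a made-up stand-in for the emulator, and a central finite difference replaces whatever differentiation the authors actually use.

```python
def toy_displacement(t):
    """Hypothetical displacement of a single particle at time t (x, y, z)."""
    return [0.5 * t * t, -0.1 * t, 0.0]


def velocity(displacement_fn, t, h=1e-4):
    """Approximate d(displacement)/dt with a central finite difference."""
    ahead = displacement_fn(t + h)
    behind = displacement_fn(t - h)
    return [(a - b) / (2 * h) for a, b in zip(ahead, behind)]


v = velocity(toy_displacement, t=2.0)
```

Deriving velocity from the same output keeps the two fields consistent by construction, instead of relying on a second prediction head to agree with the first.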