Hugging Face
James K J
james92
1 follower · 14 following
s1782662
AI & ML interests
Reinforcement Learning, Computer Vision
Recent Activity
liked a model about 5 hours ago: BAAI/bge-small-en-v1.5
upvoted a collection 4 months ago: Journal Club
replied to Jaward's post 9 months ago
After giving GPU Programming a hands-on try, I have come to appreciate the level of complexity in AI compute:
- Existing/leading frameworks (CUDA, OpenCL, DSLs, even Triton), still fall at the mercy of low-level compute that requires deeper understanding and experience.
- Ambiguous optimizations methods that will literally drive you mad 🤯
- Triton is cool but not cool enough (high level abstractions that fall back to low level compute issues as you build more specialized kernels)
- As for CUDA, optimization requires considering all major components of the GPU (DRAM, SRAM, ALUs)
- Models today require stallion written GPU kernels to reduce storage and compute cost.
- GPTQ was a big save
@karpathy is right expertise in this area is scarce and the reason is quite obvious - uncertainties: we are still struggling to get peak performance from multi-connected GPUs while maintaining precision and reducing cost. May the Scaling Laws favor us lol.
Organizations
None yet
models
2
Sort: Recently updated
james92/lora • Updated Dec 20, 2023 • 12
james92/bloom7b__finetune_sample • Updated Dec 4, 2023 • 6
datasets
None public yet