unsafe.host
currently serving 11 threads and 45 posts

unsafe.host is a simple discussion forum. it's really quite safe!

rules:

you may use a tripcode by adding #your_tripcode to your name.
adding ~sage to your name will prevent your reply from bumping the thread.
timestamps are rendered using .beat time.
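
for anyone unfamiliar with .beat time (swatch internet time): the day is split into 1000 beats of 86.4 seconds each, counted from midnight in biel mean time (fixed utc+1, no daylight saving). below is a minimal sketch of how a timestamp like 2025-05-19@830 could be produced. the function names, the zero-padding to three digits, and taking the date part in utc+1 are assumptions for illustration, not necessarily what this site does.

```python
from datetime import datetime, timedelta, timezone

# Biel Mean Time: fixed UTC+1 offset, no daylight saving.
BMT = timezone(timedelta(hours=1))


def to_beats(moment: datetime) -> int:
    """Return the whole .beat (0-999) for a timezone-aware datetime."""
    bmt = moment.astimezone(BMT)
    seconds = bmt.hour * 3600 + bmt.minute * 60 + bmt.second
    # 86400 seconds per day / 1000 beats = 86.4 s per beat.
    # Integer arithmetic avoids floating-point rounding at beat boundaries.
    return seconds * 1000 // 86400


def format_beat_timestamp(moment: datetime) -> str:
    """Format as YYYY-MM-DD@BBB (hypothetical format; date taken in UTC+1)."""
    bmt = moment.astimezone(BMT)
    return f"{bmt:%Y-%m-%d}@{to_beats(moment):03d}"


if __name__ == "__main__":
    print(format_beat_timestamp(datetime.now(timezone.utc)))
```

the input has to be timezone-aware; a naive datetime would be interpreted as local time by astimezone, which gives a different beat depending on where the code runs.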

you may use the following emojis:
:baka: :blessed: :cat: :dead: :depressed: :disgust: :gun: :happy-claude: :hehe: :hmm: :imfine: :kek: :kms: :love: :mikushock: :nobully: :o_o: :omega-lul: :oof: :pet: :pout: :puddle: :sad-cowboy: :smug: :sob: :spray: :this:

there have been a total of 11 threads and 45 posts on this site to date.

link sharing thread: ml/ai edition
anonymous 2025-05-19@830 No.31

let's share some cool ml/ai papers and other resources!
here's a start:

https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/ (The 37 Implementation Details of Proximal Policy Optimization)
https://arxiv.org/pdf/2101.03961 (Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity)
https://arxiv.org/abs/2505.07215 (Measuring General Intelligence with Generated Games)
https://arxiv.org/html/2504.20571v1 (Reinforcement Learning for Reasoning in Large Language Models with One Training Example)

anonymous 2025-05-19@831 No.32

https://storage.googleapis.com/public-technical-paper/INTELLECT_2_Technical_Report.pdf (INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning)
https://howtoscalenn.github.io/ (How To Scale)
https://rlhfbook.com/ (Reinforcement Learning from Human Feedback: A short introduction to RLHF and post-training focused on language models)

anonymous 2025-05-19@833 No.33

https://arxiv.org/abs/2503.01067 (All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning)
https://gorilla.cs.berkeley.edu/blogs/13_bfcl_v3_multi_turn.html (BFCL V3 • Multi-Turn & Multi-Step Function Calling Evaluation)
https://arxiv.org/abs/2505.03335 (Absolute Zero: Reinforced Self-play Reasoning with Zero Data)
https://arxiv.org/pdf/2109.08668 (Primer: Searching for Efficient Transformers for Language Modeling)
https://ofir.io/How-to-Build-Good-Language-Modeling-Benchmarks/ (How to Build Good Language Modeling Benchmarks)