Spring 2024

Blog-Post-Assignment (Spring 24) #

This is the GitHub page for the class Efficient ML Systems (EECE695D-01). Each team will upload a blog post reviewing its assigned paper.

Guideline for Students #

  • Toggle the menu and find your paper number. On that paper's page, you will find the Edit this page button.

    • Edit the markdown file. If you want to attach an image (e.g., a pipeline figure), you can add the file using the Upload files button.

    💡 You can reference the TA's example post 👉 LINK

    • A guide on how to use KaTeX has been posted.
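
    As a quick illustration (assuming the site renders the standard KaTeX `$…$` and `$$…$$` delimiters; the formula below is just a generic placeholder, not from the course), math can be written directly in the markdown file:

    ```markdown
    Inline math: the loss $\mathcal{L}(\theta)$ decreases during training.

    Display math:
    $$
    \mathcal{L}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \ell\big(f_\theta(x_i),\, y_i\big)
    $$
    ```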

Contact #

If you have any questions, please feel free to contact the TA (hagyeonglee@postech.ac.kr).

Paper List #

| Paper No. | Title | Team |
| --- | --- | --- |
| 0 (example) | Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity | Hagyeong Lee |
| 1 | Is Bigger Edit Batch Size Always Better? – An Empirical Study on Model Editing with Llama-3 | Jin Hyun, Gyuhyun Jung |
| 2 | Spectrally Pruned Gaussian Fields with Neural Compensation | Donggeon Lee, Chiho Yoon |
| 3 | Unit Scaling: Out-of-the-Box Low-Precision Training | SeongRok Moon, Changyoung Ju |
| 4 | Better & Faster Large Language Models via Multi-token Prediction | Jinoh Cho, Seonghyeon Park |
| 5 | Lossless Self-Speculative Decoding via Double Early Exiting | Nayoung Kwon, Jiwoong Im |
| 6 | XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference | Hyundong Kim, Sangil Han |
| 7 | VeRA: Vector-based Random Matrix Adaptation | Kyumin Cho, Sejin Park |
| 8 | Mixture of LoRA Experts | Jegwang Ryu, Sangbeom Ha |
| 9 | MobileNetV4 – Universal Models for the Mobile Ecosystem | JoonSeok Kim, DongGyu Kim |
| 10 | Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length | Hyunho Kook |
| 11 | Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention | Younghyun Cho, Sangjun Lee |
| 12 | Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies | Junkyeong Park, Harit Keawmuang |
| 13 | A Large-Scale Exploration of μ-Transfer | Jeonghyun Choi, Minhye Choo |
| 14 | BinaryDM: Towards Accurate Binarization of Diffusion Model | Junhyuk So, Juncheol Shin |
| 15 | Training LLMs over Neurally Compressed Text | Seonghyun Park, Jiwoo Kim |
| 16 | Mixture-of-Depths: Dynamically allocating compute in transformer-based language models | Minjae Park, Inkwan Hwang |
| 17 | QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs | MyeongJi Yun, Jung Gyu Min |
| 18 | ViTAR: Vision Transformer with Any Resolution | Jungwon Lee, Minsang Seok |
| 19 | LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Sungbin Shin, Dongyeop Lee |
| 20 | Evolutionary Optimization of Model Merging Recipes | Youngkil Song, Jaehyeon Park |
| 21 | A Unified Framework for Model Editing | Jonghyun Chae, Donggeun An |
| 22 | Larimar: Large Language Models with Episodic Memory Control | Sunggyu Jang, Hyeonwoo Park |
| 23 | Beyond Language Models: Byte Models are Digital World Simulators | Dohun Kim, Yeongwoo Kim |
| 24 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Seungjoo Shin, Sua Choi |
| 25 | Merging Text Transformer Models from Different Initializations | Minwoo Kim, Kyungtae Kim |