Spring 2024

Blog-Post-Assignment (Spring 24) #

This is the GitHub page for the class Efficient ML Systems (EECE695D-01). Each team will upload a blog post reviewing its assigned paper.

Guideline for Students #

  • Toggle the menu and find your paper number. On that paper's page, you will find the Edit this page button.

    • Edit the markdown file. If you want to attach an image (e.g., a pipeline figure), you can add the file using the Upload files button.

    💡 You can reference the TA's example post 👉 LINK

    • A guide on how to use KaTeX has been posted.
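
    As a quick illustration (assuming the site renders the standard KaTeX `$…$` and `$$…$$` delimiters; the formula below is just a generic placeholder, not from the course), math can be written directly in the markdown file:

    ```markdown
    Inline math: the loss $\mathcal{L}(\theta)$ decreases during training.

    Display math:
    $$
    \mathcal{L}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \ell\big(f_\theta(x_i),\, y_i\big)
    $$
    ```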

Contact #

If you have any questions, please feel free to contact the TA (hagyeonglee@postech.ac.kr).

Paper List #

| Paper No. | Title | Team |
| --- | --- | --- |
| 0 (example) | Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity | Hagyeong Lee |
| 1 | Is Bigger Edit Batch Size Always Better? – An Empirical Study on Model Editing with Llama-3 | Jin Hyun, Gyuhyun Jung |
| 2 | Spectrally Pruned Gaussian Fields with Neural Compensation | Donggeon Lee, Chiho Yoon |
| 3 | Unit Scaling: Out-of-the-Box Low-Precision Training | SeongRok Moon, Changyoung Ju |
| 4 | Better & Faster Large Language Models via Multi-token Prediction | Jinoh Cho, Seonghyeon Park |
| 5 | Lossless Self-Speculative Decoding via Double Early Exiting | Nayoung Kwon, Jiwoong Im |
| 6 | XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference | Hyundong Kim, Sangil Han |
| 7 | VeRA: Vector-based Random Matrix Adaptation | Kyumin Cho, Sejin Park |
| 8 | Mixture of LoRA Experts | Jegwang Ryu, Sangbeom Ha |
| 9 | MobileNetV4 – Universal Models for the Mobile Ecosystem | JoonSeok Kim, DongGyu Kim |
| 10 | Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length | Hyunho Kook |
| 11 | Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention | Younghyun Cho, Sangjun Lee |
| 12 | Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies | Junkyeong Park, Harit Keawmuang |
| 13 | A Large-Scale Exploration of μ-Transfer | Jeonghyun Choi, Minhye Choo |
| 14 | BinaryDM: Towards Accurate Binarization of Diffusion Model | Junhyuk So, Juncheol Shin |
| 15 | Training LLMs over Neurally Compressed Text | Seonghyun Park, Jiwoo Kim |
| 16 | Mixture-of-Depths: Dynamically allocating compute in transformer-based language models | Minjae Park, Inkwan Hwang |
| 17 | QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs | MyeongJi Yun, Jung Gyu Min |
| 18 | ViTAR: Vision Transformer with Any Resolution | Jungwon Lee, Minsang Seok |
| 19 | LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Sungbin Shin, Dongyeop Lee |
| 20 | Evolutionary Optimization of Model Merging Recipes | Youngkil Song, Jaehyeon Park |
| 21 | A Unified Framework for Model Editing | Jonghyun Chae, Donggeun An |
| 22 | Larimar: Large Language Models with Episodic Memory Control | Sunggyu Jang, Hyeonwoo Park |
| 23 | Beyond Language Models: Byte Models are Digital World Simulators | Dohun Kim, Yeongwoo Kim |
| 24 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Seungjoo Shin, Sua Choi |
| 25 | Merging Text Transformer Models from Different Initializations | Minwoo Kim, Kyungtae Kim |