Zecheng Tang

ZetangForward

https://zetangforward.github.io/

AI & ML interests

Natural Language Processing, Multimodal Models, Pre-trained Language Models

Recent Activity

authored a paper about 1 month ago

LOGO -- Long cOntext aliGnment via efficient preference Optimization

upvoted a paper about 1 month ago

Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch

upvoted a paper about 1 month ago

LOGO -- Long cOntext aliGnment via efficient preference Optimization

ZetangForward's activity

upvoted 2 papers about 1 month ago

Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch

Paper • 2410.18693 • Published Oct 24 • 40

LOGO -- Long cOntext aliGnment via efficient preference Optimization

Paper • 2410.18533 • Published Oct 24 • 42

upvoted 2 papers about 2 months ago

Differential Transformer

Paper • 2410.05258 • Published Oct 7 • 166

L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?

Paper • 2410.02115 • Published Oct 3 • 10

upvoted a paper 8 months ago

LLoCO: Learning Long Contexts Offline

Paper • 2404.07979 • Published Apr 11 • 20

upvoted 2 papers 10 months ago

Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens

Paper • 2401.17377 • Published Jan 30 • 34

StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis

Paper • 2401.17093 • Published Jan 30 • 19

upvoted 8 papers about 1 year ago

Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers

Paper • 2311.10642 • Published Nov 17, 2023 • 23

Trusted Source Alignment in Large Language Models

Paper • 2311.06697 • Published Nov 12, 2023 • 10

CodeFusion: A Pre-trained Diffusion Model for Code Generation

Paper • 2310.17680 • Published Oct 26, 2023 • 69

Teaching Language Models to Self-Improve through Interactive Demonstrations

Paper • 2310.13522 • Published Oct 20, 2023 • 11

Small-scale proxies for large-scale Transformer training instabilities

Paper • 2309.14322 • Published Sep 25, 2023 • 19

Multimodal Foundation Models: From Specialists to General-Purpose Assistants

Paper • 2309.10020 • Published Sep 18, 2023 • 40

OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch

Paper • 2309.10706 • Published Sep 19, 2023 • 16

LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models

Paper • 2309.09506 • Published Sep 18, 2023 • 14