BitNet

community

buaahsh authored 20 papers 10 days ago

Language Is Not All You Need: Aligning Perception with Language Models

Paper • 2302.14045 • Published Feb 27, 2023

UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation

Paper • 2303.08518 • Published Mar 15, 2023

Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 81

Kosmos-G: Generating Images in Context with Multimodal Large Language Models

Paper • 2310.02992 • Published Oct 4, 2023 • 4

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Paper • 1912.13318 • Published Dec 31, 2019 • 4

Democratizing Reasoning Ability: Tailored Learning from Large Language Model

Paper • 2310.13332 • Published Oct 20, 2023 • 16

DocBank: A Benchmark Dataset for Document Layout Analysis

Paper • 2006.01038 • Published Jun 1, 2020

MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers

Paper • 2012.15828 • Published Dec 31, 2020 • 1

Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment

Paper • 2106.06381 • Published Jun 11, 2021

DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

Paper • 2106.13736 • Published Jun 25, 2021

Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training

Paper • 2109.07306 • Published Sep 15, 2021

Scaling Sentence Embeddings with Large Language Models

Paper • 2307.16645 • Published Jul 31, 2023

PromptBERT: Improving BERT Sentence Embeddings with Prompts

Paper • 2201.04337 • Published Jan 12, 2022

DeepNet: Scaling Transformers to 1,000 Layers

Paper • 2203.00555 • Published Mar 1, 2022 • 2

$Se^2$: Sequential Example Selection for In-Context Learning

Paper • 2402.13874 • Published Feb 21, 2024 • 1

Text Diffusion with Reinforced Conditioning

Paper • 2402.14843 • Published Feb 19, 2024

On the Representation Collapse of Sparse Mixture of Experts

Paper • 2204.09179 • Published Apr 20, 2022 • 1

ResLoRA: Identity Residual Mapping in Low-Rank Adaption

Paper • 2402.18039 • Published Feb 28, 2024 • 11

Task-Specific Expert Pruning for Sparse Mixture-of-Experts

Paper • 2206.00277 • Published Jun 1, 2022 • 1

Language Models are General-Purpose Interfaces

Paper • 2206.06336 • Published Jun 13, 2022 • 1