Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps
Submitted by Richard ZHou
RTP-LLM