You may need to use the gpu_memory_limit and/or lora_on_cpu config options to avoid running out of memory. If you still run out of CUDA memory, you can try to merge in system RAM with