Multimode fiber endoscopes provide extreme miniaturization of imaging components for minimally invasive deep tissue imaging. Typically, such fiber systems suffer from low spatial resolution and long measurement times. Fast super-resolution imaging through a multimode fiber has been achieved by using computational optimization algorithms with hand-picked priors. Machine learning reconstruction approaches offer the promise of better priors, but they require large training datasets and therefore long and impractical pre-calibration times. Here we report a method of multimode fiber imaging based on unsupervised learning with untrained neural networks. The proposed approach solves the ill-posed inverse problem without relying on any pre-training process. We demonstrate both theoretically and experimentally that untrained neural networks enhance the imaging quality and provide sub-diffraction spatial resolution in the multimode fiber imaging system.