アプリとサービスのすすめ

I mainly write about apps and IT services. Now and then I also jot down notes on side jobs and business topics, as the mood takes me.

Good and bad examples with MiDaS Depth Estimation (experiment)

I looked into why MiDaS works well on some images and poorly on others.
If MiDaS works well on some images, those images must share specific patterns. Likewise, there must be patterns that make it work badly.


So, to confirm that there are good and bad patterns for MiDaS depth estimation, especially under specific image conditions, I ran an experiment with 3 patterns this time.

This is a memo for my own use, so I wrote it casually rather than formally.

Bad pattern

This is the bad pattern.

Original image

Predicted image


Why this is a bad pattern

1. The target object for prediction is small.
2. A very close object is in view and disturbs the prediction. In this case, it is the edge of the car.
3. The image should have been framed mainly around the target object for estimation.
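
As a rough way to put a number on "small" in point 1, here is a minimal sketch that computes how much of the frame a target's bounding box covers. The function name `target_coverage`, the image size, and the box coordinates are all hypothetical values for illustration, not measurements from this experiment.

```python
def target_coverage(image_size, target_box):
    """Return the fraction of the image area covered by the target box.

    image_size: (width, height) in pixels.
    target_box: (left, upper, right, lower) in pixels.
    """
    width, height = image_size
    left, upper, right, lower = target_box
    # Clip the box to the image so an oversized box cannot exceed 100%.
    left, right = max(0, left), min(width, right)
    upper, lower = max(0, upper), min(height, lower)
    box_area = max(0, right - left) * max(0, lower - upper)
    return box_area / (width * height)


# Hypothetical numbers: an 80x60 px target in a 1280x720 frame.
coverage = target_coverage((1280, 720), (600, 330, 680, 390))
print(f"target covers {coverage:.2%} of the frame")
```

A low value like this only says that the target takes up very little of the frame. It does not by itself prove that MiDaS will fail.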


Total prediction stream

Better pattern

This is the better pattern. I changed the image conditions.

Changed conditions

1. I cropped both height and width, and also removed the unnecessary parts on both sides.
2. I made the target objects for estimation appear bigger than in the previous bad pattern.

Original image


Predicted image


Why this is a better pattern

1. The target objects for prediction appear big enough in the image to predict.
2. Cropping removed the parts of the view that do not move (the unnecessary parts), and this made the prediction better.
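
The cropping used for this pattern (trimming both height and width) can be sketched as a small helper that turns "trim this fraction from each edge" into a pixel box. This is only a sketch under my own assumptions: the name `crop_box`, the frame size, and the fractions are hypothetical, and the actual crop values used in the experiment are not recorded here.

```python
def crop_box(image_size, top=0.0, bottom=0.0, left=0.0, right=0.0):
    """Return a (left, upper, right, lower) pixel box.

    Each keyword is the fraction of the image to trim from that edge.
    """
    width, height = image_size
    for frac in (top, bottom, left, right):
        if not 0.0 <= frac < 1.0:
            raise ValueError("each fraction must be in [0, 1)")
    if top + bottom >= 1.0 or left + right >= 1.0:
        raise ValueError("fractions would trim the whole image")
    x0 = round(width * left)
    y0 = round(height * top)
    x1 = width - round(width * right)
    y1 = height - round(height * bottom)
    return (x0, y0, x1, y1)


# Hypothetical values: trim 25% from top and bottom, 20% from each side.
box = crop_box((1280, 720), top=0.25, bottom=0.25, left=0.2, right=0.2)
print(box)

# With Pillow installed, the box can be applied like this:
#   from PIL import Image
#   cropped = Image.open("frame.png").crop(box)  # "frame.png" is a placeholder
```

The tuple order matches the box that Pillow's `Image.crop` expects, so the result can be passed to it directly.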

Total prediction stream (gif)


Best pattern

This is the best pattern. I changed the image conditions further.


Changed conditions

1. I cropped the height and removed the unnecessary parts.
2. I didn't crop the width, to keep the image balanced. That is, I adjusted the image size so that it is balanced enough to see the target objects and easy enough to predict.
3. I made the target objects for estimation appear big.
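
The conditions above (crop the height, keep the full width) can be sketched like this. The names `crop_height` and `box_size`, the frame size, and the fractions are hypothetical. The sketch only shows how to check the resulting size and aspect ratio, and does not claim that any particular ratio is the right one for MiDaS.

```python
def crop_height(image_size, top=0.0, bottom=0.0):
    """Return a (left, upper, right, lower) box that keeps the full width."""
    width, height = image_size
    if not (0.0 <= top < 1.0 and 0.0 <= bottom < 1.0 and top + bottom < 1.0):
        raise ValueError("top and bottom must be fractions that leave some rows")
    upper = round(height * top)
    lower = height - round(height * bottom)
    return (0, upper, width, lower)


def box_size(box):
    """Return (width, height) of a (left, upper, right, lower) box."""
    left, upper, right, lower = box
    return (right - left, lower - upper)


# Hypothetical values: trim 25% from the top and 25% from the bottom.
box = crop_height((1280, 720), top=0.25, bottom=0.25)
cropped_width, cropped_height = box_size(box)
print(box, f"aspect ratio = {cropped_width / cropped_height:.2f}")
```

Printing the size next to the box makes it easy to compare this crop with one that also trims the width.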


Original image


Predicted image


Why this is the best pattern

1. The target objects for prediction appear big enough in the image to predict.
2. The image size is well-balanced, so prediction became easier.
=> In the previous case I cropped too much: the target objects were big enough to predict, but the image size was unbalanced for prediction.

3. The target objects are big and the image size is well-balanced. Together, these two conditions make prediction work better than in any other pattern.

Total prediction stream (gif)


To make MiDaS work better, you have to adjust the image conditions so that they suit prediction.