How to easily image search with python

This is the second time that I’m writing about how to do image search with python. The first blog post that I wrote on the subject got a lot of interest, and even today I regularly get people commenting on it or coming to the github repo asking for help. So I figured it was time for a refresher.

Python imagesearch is now a pip-eable package

I have put in a bit of work to turn the library into a package, so that you can just pip install it. This is a much better solution than me saying nonsense like “copy the file in your project”. Now it is as easy as running:

pip3 install python-imageseach-drov0

The above will probably fail, or you won’t be able to use the library, because you need extra packages depending on your OS:

Linux

sudo pip3 install python3-xlib
sudo apt-get install -y scrot
sudo apt-get install -y python3-tk
sudo apt-get install -y python3-dev
sudo apt-get install -y python3-opencv


macOS

brew install opencv
pip3 install -U pyobjc-core
pip3 install -U pyobjc

Windows

No extra installation steps needed 🙂

Quick start

The simplest example to do image search with python is this:

from python_imagesearch.imagesearch import imagesearch

# search the whole screen; the result is [-1, -1] when nothing matches
pos = imagesearch("./github.png")
if pos[0] != -1:
    print("position : ", pos[0], pos[1])
else:
    print("image not found")

This simply searches for one occurrence of the image “github.png” on the screen and prints its x/y position.
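If you are curious what such a search boils down to, here is a minimal pure-python sketch of template matching on tiny grayscale grids (lists of lists). It is only an illustration under stated assumptions: the function name find_template and the normalized-correlation score are hypothetical choices made for this example, not the library’s actual code, which leans on opencv.

```python
import math


def find_template(screen, template, precision=0.8):
    """Return [x, y] of the best match's top-left corner, or [-1, -1].

    Hypothetical helper for illustration only: screen and template are
    2D lists of grayscale values, scored with a normalized correlation.
    """
    sh, sw = len(screen), len(screen[0])
    th, tw = len(template), len(template[0])

    # Center the template values once (row-major order).
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))

    best_score, best_pos = -1.0, [-1, -1]
    for y in range(sh - th + 1):
        for x in range(sw - tw + 1):
            # Same row-major order as the template values.
            w_vals = [screen[y + j][x + i] for j in range(th) for i in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            if t_norm == 0 or w_norm == 0:
                continue  # flat patch: the score is undefined, skip it
            score = sum(a * b for a, b in zip(t_dev, w_dev)) / (t_norm * w_norm)
            if score > best_score:
                best_score, best_pos = score, [x, y]

    # Only report the best candidate if it clears the precision threshold.
    return best_pos if best_score >= precision else [-1, -1]
```

The precision argument plays the same role as in the functions described in this post: raising it rejects weaker candidates, so you get fewer false positives but also less tolerance for small visual differences.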

Other functions

imagesearcharea

Performs an image search on a specific rectangle of the screen. It’s very useful to speed up searches, as there is less screen space to search.
It’s also useful to focus the search on a specific part of the screen, to reduce the chances of a false positive.

pos = imagesearcharea("./github.png", 0, 0, 800, 600)
if pos[0] != -1:
    print("position : ", pos[0], pos[1])
else:
    print("image not found")

Input:
image : path to the image file (see opencv imread for supported types)
x1 : top left x value
y1 : top left y value
x2 : bottom right x value
y2 : bottom right y value
precision : the higher the value, the less tolerant the match and the fewer false positives; default is 0.8
im : a PIL image, useful if you intend to search the same unchanging region for several elements

Output:
the top left corner coordinates of the element if found as an array [x,y] or [-1,-1] if not
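One thing to watch when searching a sub-region: the position you get back may be relative to the region rather than to the full screen. The sketch below assumes exactly that (please verify it against your installed version) and shows how the result could be translated back. The helper name to_screen_coords is hypothetical and not part of the library.

```python
def to_screen_coords(pos, x1, y1):
    """Translate a region-relative [x, y] into full-screen coordinates.

    Assumption: pos is relative to the region's top-left corner (x1, y1).
    A "not found" result ([-1, -1]) is passed through unchanged.
    """
    if pos[0] == -1:
        return [-1, -1]
    return [pos[0] + x1, pos[1] + y1]
```

For example, with a region starting at (100, 200), you would call to_screen_coords(pos, 100, 200) on the value returned by imagesearcharea before using it with anything that expects screen coordinates.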

region_grabber

Very useful to optimize imagesearcharea or imagesearch calls: by grabbing an already processed image once, you can perform multiple searches on it with great speed gains. Here’s an example:

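import time

from python_imagesearch.imagesearch import imagesearcharea, region_grabber

# non-optimized way: every call grabs the screen region again
# (time.perf_counter() replaces time.clock(), which was removed in Python 3.8)
time1 = time.perf_counter()
for i in range(10):
    imagesearcharea("./github.png", 0, 0, 800, 600)
    imagesearcharea("./panda.png", 0, 0, 800, 600)
print(str(time.perf_counter() - time1) + " seconds (non optimized)")

# optimized way: grab the region once and reuse it for every search

time1 = time.perf_counter()
im = region_grabber((0, 0, 800, 600))
for i in range(10):
    imagesearcharea("./github.png", 0, 0, 800, 600, 0.8, im)
    imagesearcharea("./panda.png", 0, 0, 800, 600, 0.8, im)
print(str(time.perf_counter() - time1) + " seconds (optimized)")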

# sample output :

# 1.6233619831305721 seconds (non optimized)
# 0.4075934110084374 seconds (optimized)

Input: a tuple containing the 4 coordinates of the region to capture, in this order: topx, topy, bottomx, bottomy

Output: a PIL image of the area selected.

imagesearch_loop

Searches for an image on screen continuously until it’s found. Useful to make a script wait until a given image appears, for instance waiting for the end of a loading screen.

from python_imagesearch.imagesearch import imagesearch_loop

pos = imagesearch_loop("./github.png", 1)
print("position : ", pos[0], pos[1])

Input:
image : path to the image file (see opencv imread for supported types)
time : Waiting time after failing to find the image (seconds)
precision : the higher the value, the less tolerant the match and the fewer false positives; default is 0.8

Output:
the top left corner coordinates of the element if found as an array [x,y]

imagesearch_numLoop

Searches for an image on screen continuously until it’s found or the maximum number of samples is reached.

from python_imagesearch.imagesearch import imagesearch_numLoop

pos = imagesearch_numLoop("./github.png", 1, 50)
if pos[0] != -1:
    print("position : ", pos[0], pos[1])
else:
    print("image not found")

Input:
image : path to the image file (see opencv imread for supported types)
time : Waiting time after failing to find the image
maxSamples: maximum number of samples before function times out.
precision : the higher the value, the less tolerant the match and the fewer false positives; default is 0.8

Output: the top left corner coordinates of the element if found as an array [x,y] or [-1,-1] if the maximum number of samples was reached
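To make the retry behaviour concrete, here is a rough sketch of such a polling loop. It is not the library’s implementation: search_with_retries is a hypothetical helper that takes any zero-argument search callable, and the sleep parameter is only there so the loop can be exercised without actually waiting.

```python
import time


def search_with_retries(search, delay, max_samples, sleep=time.sleep):
    """Call search() until it reports a hit or max_samples attempts are used.

    search      : callable returning [x, y], or [-1, -1] when nothing is found
    delay       : seconds to wait after a failed attempt
    max_samples : maximum number of attempts before giving up
    """
    for attempt in range(max_samples):
        pos = search()
        if pos[0] != -1:
            return pos
        # No point in sleeping after the very last attempt.
        if attempt < max_samples - 1:
            sleep(delay)
    return [-1, -1]
```

With the library installed, a call could look like search_with_retries(lambda: imagesearch("./github.png"), 1, 50), which mirrors the imagesearch_numLoop example above.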

imagesearch_region_loop

Very similar to imagesearch_loop, except that it only searches within a region of the screen.

from python_imagesearch.imagesearch import imagesearch_region_loop

pos = imagesearch_region_loop("./github.png", 1, 0, 0, 800, 600)
print("position : ", pos[0], pos[1])

Input:
image : path to the image file (see opencv imread for supported types)
time : Waiting time after failing to find the image
x1 : top left x value
y1 : top left y value
x2 : bottom right x value
y2 : bottom right y value
precision : the higher the value, the less tolerant the match and the fewer false positives; default is 0.8


Output:
the top left corner coordinates of the element as an array [x,y]

imagesearch_count

Counts how many occurrences of the image there are on the screen.

from python_imagesearch.imagesearch import imagesearch_count

count = imagesearch_count("./github.png")
print(count)

Input:
image : path to the target image file (see opencv imread for supported types)
precision : the higher the value, the less tolerant the match and the fewer false positives; default is 0.9

Output:
the number of times a given image appears on the screen.
optionally an output image with all the occurrences boxed with a red outline.

imagesearch_from_folder

Performs an imagesearch on all the images in a folder. This function was contributed by kadusalles.

from python_imagesearch.imagesearch import imagesearch_from_folder

results = str(imagesearch_from_folder('./', 0.8))
print(results)

Input:
path: to the folder containing the images (supported image types are jpg, gif, png and jpeg)
precision : the higher the value, the less tolerant the match and the fewer false positives; default is 0.9

Output:
A dictionary with all the images, where the key is the image path and the value is its position
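Since the dictionary holds every image in the folder, you will often want only the ones that were actually found. Here is a small sketch for that, assuming images that were not found are stored with the usual [-1, -1] position (an assumption worth checking on your side). The helper name found_only is hypothetical.

```python
def found_only(results):
    """Keep only the entries whose position is not the [-1, -1] marker.

    results: dict mapping image path -> [x, y] position, as described above.
    """
    return {path: pos for path, pos in results.items() if pos[0] != -1}
```

Note that this works on the dictionary itself, so apply it before converting the results to a string for printing.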

Conclusion

And that’s about it! Now you should be able to easily perform image search with python. If you are interested in the actual code or want to contribute, feel free to head on over to the github repository: https://github.com/drov0/python-imagesearch. And if you liked my article, come see more at https://brokencode.io

What I learned coding (almost) every day for a year and a half

Two years ago I started working on steem, a really cool blockchain where your upvotes are worth actual money. There are quite a few decentralized apps that run on it. Shortly after, a startup called utopian.io launched, where open source contributions were rewarded. I figured it couldn’t hurt to try, and it was the extra motivation I needed to finally start working on open source.

I had been wanting to work on open source software for quite some time, so I opened my github account and started coding. I mostly worked on my own projects, releasing a ton of code snippets to help future developers work on steem. But I also open sourced some of my personal libraries, for instance https://github.com/drov0/python-imagesearch, a python library to easily perform image search for automation (I mostly used it to automate games at the time).

The plan was never to code every day at all. But then, Github has one feature, one feature that made me work on it rather than, say, gitlab or bitbucket:

The contribution graph

The contribution graph is a simple thing: whenever you make a commit, open an issue or open a pull request, it adds a green square for the day. And the more actions you do that day, the darker the square. You can’t really say “oh the square is super dark so you did a lot that day”, but it’s a good rough metric.

I didn’t really pay much attention to it at first, but the more I looked at it, the more it spoke to me. It said:

“Can you code every single day, for an entire year ?”

It was daunting at first, but then I took on the challenge and started coding. Around the same time I was launching my startup SteemPress as a side project, a plugin to connect wordpress and steem. And since I love programming, it wasn’t hard for me to come home from 8 hours of programming (I was a software engineer working on embedded software) and then sit at my computer to work some more.

Little by little the squares started to fill, and I felt prouder and prouder of my ongoing achievement. And this is where I learned my first big lesson:

Consistency is hard

Coding every day sounds OK for a week, or even for a month. But then there are edge cases that you didn’t really think about when you started this whole endeavor:

  • What do you do if there’s an after-work party right after work and you know you won’t be home before midnight ?
  • What do you do if you get sick ?
  • What do you do if you are on vacation ?
  • What do you do if you don’t want to work ?

This is the point where you start thinking about the rules. A green square is just one commit: does adding a bit of documentation to a README on your phone count ? Does adding a line of comment count ? What is cheating and what is not ? In the end I decided that it was okay to edit READMEs or add comments as long as it was:

  • Relevant (adding a useless line or fixing a typo is cheating)
  • Not too often (once every two weeks, tops)

The x effect

There are studies that show that if you do something every day for 30 to 45 days, it becomes a habit. And that effect is greatly amplified if you have something physical, like a box to tick when you did your daily habit correctly. There’s even a subreddit dedicated to that. At the time I wasn’t aware of it, but Github and the graph sure as hell had that effect on me. Over time consistency wasn’t as hard as before and became more of a habit: I was longing for my daily side project programming. It was thrilling. I learned a ton, taught myself a lot of new technologies and contributed to the open source community (and earned a few bucks while doing so).

But after a year and a half I decided to quit. Why, you ask ?

Quitting

The contribution graph for 2019

I recently had a drop in motivation (it shows on my Github graph around March to May). During that time it was a real hassle to do my daily programming tasks. I didn’t want to do it, but I still wanted my green mark, so I was cheating more and more: making minor edits, adding comments that weren’t really necessary, etc. Finally I said “Stop, if you’re gonna cheat you might as well not push anything”. Also, Github only adds things to the contribution graph if you push to master, so I was sometimes working with bad practices because I didn’t want to use branches. (The contributions are counted when you merge the branch to master, but that means not having green boxes for quite a while if you work on that branch for weeks.) So I realized that this habit was doing more harm than good. For me at least.

In conclusion

This was a great experience. If you go to my Github you can see that even though I said I “quit”, I still push things very often and rarely miss a day. The difference is that I don’t push myself to program every day, I just let it happen. I definitely recommend that you check out steem and utopian.io and follow the same path as I did. It teaches you a lot, and it’s also a good way of selling yourself and your skillset to future employers. It’s no longer “hey I love programming, trust me, I do”, it’s “look at my Github, see how I code, see my skillset”, and if the person knows what they’re talking about, they’ll know.
Especially if you have good open source projects to show. In my experience people rarely check your Github, but when they do it’s a big plus to show your dedication.