RecursionError: maximum recursion depth exceeded while calling a Python object

I've built a crawler that has to run on about 5M pages (by incrementing the URL ID) and then parse the pages which contain the info I need.

After running an algorithm over 200K of the URLs and saving the good and bad results, I found that I was wasting a lot of time. I could see that there are a few recurring subtrahends which I can use to check the next valid URL.

You can spot the subtrahends quite fast (a little example of the first few "good IDs"):

510000011 # +8
510000029 # +18
510000037 # +8
510000045 # +8
510000052 # +7
510000060 # +8
510000078 # +18
510000086 # +8
510000094 # +8
510000102 # +8
510000110 # etc.
510000128
510000136
510000144
510000151
510000169
510000177
510000185
510000193
510000201
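
Computing those deltas from the saved IDs takes only a couple of lines. A quick sketch (good_ids stands in for my saved results):

from collections import Counter

good_ids = [510000011, 510000029, 510000037, 510000045, 510000052,
            510000060, 510000078, 510000086, 510000094, 510000102]

# deltas between consecutive good IDs
deltas = [b - a for a, b in zip(good_ids, good_ids[1:])]
print(Counter(deltas).most_common())  # [(8, 6), (18, 2), (7, 1)]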

After crawling about 200K URLs, which gave me only 14K good results, I knew I was wasting my time and needed to optimize, so I ran some statistics and built a function that checks the URLs while increasing the ID by 8/18/17/8 (the top recurring subtrahends), etc.

This is the function:

def checkNextID(ID):
    global numOfRuns, curRes, lastResult
    while ID < lastResult:
        try:
            numOfRuns += 1
            if numOfRuns % 10 == 0:
                time.sleep(3)  # sleep every 10 iterations
            # try the most common subtrahends first
            if isValid(ID + 8):
                parseHTML(curRes)
                checkNextID(ID + 8)
                return 0
            if isValid(ID + 18):
                parseHTML(curRes)
                checkNextID(ID + 18)
                return 0
            if isValid(ID + 7):
                parseHTML(curRes)
                checkNextID(ID + 7)
                return 0
            if isValid(ID + 17):
                parseHTML(curRes)
                checkNextID(ID + 17)
                return 0
            if isValid(ID + 6):
                parseHTML(curRes)
                checkNextID(ID + 6)
                return 0
            if isValid(ID + 16):
                parseHTML(curRes)
                checkNextID(ID + 16)
                return 0
            else:
                # none of the common steps matched; fall back to +1
                checkNextID(ID + 1)
                return 0
        except Exception as e:
            print("something went wrong: " + str(e))

What it basically does: checkNextID(ID) gets the first ID I know contains the data, minus 8, so the first iteration will match the first "if isValid" clause (isValid(ID + 8) will return True).

lastResult is a variable which saves the last known URL ID, so we'll run until ID reaches lastResult.

isValid() is a function that takes an ID plus one of the subtrahends and returns True if the URL contains what I need, saving a soup object of the URL to a global variable named curRes; it returns False if the URL doesn't contain the data I need.

parseHTML is a function that takes the soup object (curRes), parses the data I need, saves the data to a CSV, and then returns True.
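
The real implementations are in the code linked below; their contract is roughly this shape (a simplified, hypothetical sketch — requests, BeautifulSoup, the URL, and the selector are all stand-ins for my actual setup):

import csv
import requests
from bs4 import BeautifulSoup

BASE_URL = "http://example.com/item/%d"  # stand-in for the real site

def isValid(ID):
    global curRes
    soup = BeautifulSoup(requests.get(BASE_URL % ID).text, "html.parser")
    data = soup.find("div", class_="data")  # stand-in selector
    if data is not None:
        curRes = soup  # save the soup object for parseHTML
        return True
    return False

def parseHTML(soup):
    # parse the data, append it to a CSV, then return True
    with open("results.csv", "a") as f:
        csv.writer(f).writerow([soup.find("div", class_="data").get_text()])
    return True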

If isValid() returns True, we call parseHTML() and then try to check the next ID plus the subtrahends (by calling checkNextID(ID + subtrahend)); if none of them returns what I'm looking for, I increase the ID by 1 and check again until I find the next valid URL.

You can see the rest of the code here.

After running the code I got about 950 good results, and suddenly an exception was raised:

"somethin went wrong: maximum recursion depth exceeded while calling a Python object"

I could see in Wireshark that the script got stuck on ID 510009541 (I started my script with 510000003); the script tried getting the URL with that ID a few times before I noticed the error and stopped it.

I was really excited to see that I got the same results, but 25x-40x faster than my old script and with fewer HTTP requests. It's very precise: I missed only 1 result per 1000 good results, which is fine by me; it's impossible to run 5M requests. I had my old script running for 30 hours and got 14-15K results, while my new script gave me ~960 results in 5-10 minutes.

I read about stack limitations, but there must be a solution for the algorithm I'm trying to implement in Python (I can't go back to my old "algorithm"; it will never end).

Thanks!

The default maximum recursion depth in Python is 1000. Since checkNextID() calls itself once for every valid ID it finds, the stack grows by one frame per good result, which is why the script blew up after roughly 950 of them.

You can verify this by calling the sys.getrecursionlimit() function:

import sys

print(sys.getrecursionlimit())  # prints 1000

You can change the limit by calling the sys.setrecursionlimit() function.

For example:

import sys

sys.setrecursionlimit(2000)
print(sys.getrecursionlimit())  # now prints 2000

Consider this a dangerous action! Raising the limit too far can crash the interpreter outright, because Python's frames still consume the underlying C stack.
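
If you do raise it, a safer pattern is to restore the old limit afterwards; a minimal sketch (with_higher_limit is just an illustrative name):

import sys

def with_higher_limit(limit, func, *args):
    # temporarily raise the recursion limit while calling func
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        return func(*args)
    finally:
        sys.setrecursionlimit(old_limit)  # always restore the old limit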

If possible, instead of tweaking the recursion limit, try to implement your algorithm iteratively to avoid deep recursion.
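
For the crawler above, the recursion is unnecessary: each recursive call is the last thing the function does, so it can be replaced by updating ID inside the existing while loop. A minimal sketch, assuming isValid, parseHTML, and the globals behave as described in the question:

import time

def checkNextID(ID):
    global numOfRuns, curRes, lastResult
    steps = [8, 18, 7, 17, 6, 16]  # most common subtrahends first
    while ID < lastResult:
        try:
            numOfRuns += 1
            if numOfRuns % 10 == 0:
                time.sleep(3)  # sleep every 10 iterations
            for step in steps:
                if isValid(ID + step):
                    parseHTML(curRes)
                    ID += step  # advance in place instead of recursing
                    break
            else:
                ID += 1  # no common step matched; scan forward one ID
        except Exception as e:
            # note: ID is not advanced here, so the same URL is retried
            print("something went wrong: " + str(e))
    return 0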


Python Maximum Recursion Depth Exceeded in Comparison

Whenever you exceed the default recursion depth of 1000, you get a RecursionError in Python.

For example, if we try to compute too large a Fibonacci number recursively, we get the recursion depth error.

# A function for computing Fibonacci numbers recursively
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

fibonacci(3000)  # RecursionError: maximum recursion depth exceeded
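
The usual fix is the same advice given above: compute iteratively instead. An iterative Fibonacci uses constant stack space; a minimal sketch (fibonacci_iterative is an illustrative name):

def fibonacci_iterative(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fibonacci_iterative(3000))  # works fine; no recursion involved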
