Personal computing discussed

Moderators: renee, SecretSquirrel, just brew it!

 
Igor_Kavinski
Minister of Gerbil Affairs
Topic Author
Posts: 2077
Joined: Fri Dec 22, 2006 2:34 am

Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 3:17 am

Even though I don't understand recursion (how the computer does it), I heard someone say that it helps to visualize a problem in terms of stuff being pushed onto a stack and once the base condition is reached, the computer will automatically start popping stuff out of the stack in LIFO order and doing whatever needs to be done with that output (treating it as an input for some other function for instance). I could be wrong though. Corrections are welcome.
 
just brew it!
Administrator
Posts: 54500
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Recursion

Tue Apr 21, 2020 6:16 am

I split this, since it was actually a spam necro of a 10-year-old thread; but Igor's followup post may be of some interest to any of the newbie coders here.

Yes, recursion is typically implemented with a stack; but that's an implementation detail.

The core concept is one of "divide and conquer". You take a problem and decompose it into smaller problems, and keep doing that until you've arrived at a trivial case. Not all problems lend themselves to a recursive solution, but it can be a powerful tool when used appropriately.

One common example is the efficient array sorting algorithm called "quicksort". You pick an element of the array (called the "pivot"), then shuffle the rest of the elements around such that all elements which are less than the pivot appear before it in the array, and all elements which are greater than the pivot appear after it. You then repeat the same process on each half of the array (below and above the pivot), and keep doing this until you reach the trivial case of sorting only one or two adjacent elements of the array. At the end of this process, you have a fully sorted array.
Nostalgia isn't what it used to be.
 
Igor_Kavinski
Minister of Gerbil Affairs
Topic Author
Posts: 2077
Joined: Fri Dec 22, 2006 2:34 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 7:22 am

So my point of confusion is, how do you handle the shuffling of two different halves using recursion?
 
just brew it!
Administrator
Posts: 54500
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 11:57 am

The partitioning step isn't the recursive part.

https://www.cc.gatech.edu/classes/cs315 ... ksort.html
Nostalgia isn't what it used to be.
 
dragontamer5788
Gerbil Elite
Posts: 715
Joined: Mon May 06, 2013 8:39 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 1:55 pm

Igor_Kavinski wrote:
Even though I don't understand recursion (how the computer does it), I heard someone say that it helps to visualize a problem in terms of stuff being pushed onto a stack and once the base condition is reached, the computer will automatically start popping stuff out of the stack in LIFO order and doing whatever needs to be done with that output (treating it as an input for some other function for instance). I could be wrong though. Corrections are welcome.


That's how a computer implements recursion. I personally only really understood it once I got down to the assembly level and started working with the stack. But whenever I explain things from that perspective, a lot of other people seem to get confused...

But the way recursion is typically taught (and how most other people seem to understand it), is approaching it from the mathematical angle through recurrence relations. Consider the sequence of triangle numbers:

Image

* TriangleNumber(1) == 1
* TriangleNumber(2) == 3
* TriangleNumber(3) == 6
* TriangleNumber(4) == 10
* TriangleNumber(5) == 15

Its kind of hard to come up with a formula for the triangle numbers. But its very easy to come up with a mathematical recurrence relationship.

TriangleNumber(n) = TriangleNumber(n-1) + n
TriangleNumber(1) = 1

This "recursive" definition works as long as you have a "base case", where the recursion stops. If you didn't have a base case, then you have an infinite loop (which actually works in some languages: such as Prolog).
 
Redocbew
Minister of Gerbil Affairs
Posts: 2495
Joined: Sat Mar 15, 2014 11:44 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 2:08 pm

I'm not sure there's any way you can really understand recursion without coding a Fibonacci generator or something which uses it.
Do not meddle in the affairs of archers, for they are subtle and you won't hear them coming.
 
just brew it!
Administrator
Posts: 54500
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 2:16 pm

I actually dislike using Fibonacci (and factorial) examples to illustrate recursion. Not because they don't illustrate recursion, but because in the real world, they tend to result in excessively deep recursion which will typically cause a stack overflow error unless the inputs are small. In the real world, these would be implemented with an explicit loop rather than recursion, to avoid this issue.
Nostalgia isn't what it used to be.
 
Redocbew
Minister of Gerbil Affairs
Posts: 2495
Joined: Sat Mar 15, 2014 11:44 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 2:31 pm

Yeah, it's not the best example in practice, and there's ways to estimate the nth Fibonacci number without calculating all the others anyway. Unfortunately, a lot of the things which really do require recursion are weird and tend to distract from the process of thinking recursively.
Do not meddle in the affairs of archers, for they are subtle and you won't hear them coming.
 
dragontamer5788
Gerbil Elite
Posts: 715
Joined: Mon May 06, 2013 8:39 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 3:04 pm

just brew it! wrote:
I actually dislike using Fibonacci (and factorial) examples to illustrate recursion. Not because they don't illustrate recursion, but because in the real world, they tend to result in excessively deep recursion which will typically cause a stack overflow error unless the inputs are small. In the real world, these would be implemented with an explicit loop rather than recursion, to avoid this issue.


Fibonnacci is a great example, because it actually takes you all the way to Dynamic Programming. You start with recursion, then you discover that its extremely inefficient recursion. You then solve the inefficiency with dynamic programming.

And yes, an explicit loop is almost always faster. But its difficult to prove that the explicit loop actually results in the same numbers. Recursion / Dynamic programming matches the recursive formulas that mathematicians use. When you move on to more complicated examples where you don't fully understand the math (but still have to make something work), that's where recursion + dynamic programming becomes much better.
 
Wirko
Gerbil Team Leader
Posts: 296
Joined: Fri Jun 15, 2007 4:38 am
Location: Central Europe

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 3:54 pm

Listing a directory tree requires recursion too and is simple enough to comprehend. Igor, think about having a file explorer that just shows you the contents of a single directory, not the tree, and lets you go into a subdirectory or one level up. How would you go about drawing a whole tree? What would be the basic program steps in this case, what would be the stack and the base condition?

Some similar problems I can think of: area fill (in raster graphics), finding a path through a maze (I guess you've done this before, both in real world and on paper), drawing a corporate structure from a list of employees and their respective managers.
 
Igor_Kavinski
Minister of Gerbil Affairs
Topic Author
Posts: 2077
Joined: Fri Dec 22, 2006 2:34 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 7:16 pm

just brew it! wrote:
I actually dislike using Fibonacci (and factorial) examples to illustrate recursion. Not because they don't illustrate recursion, but because in the real world, they tend to result in excessively deep recursion which will typically cause a stack overflow error unless the inputs are small.


I have always wanted to know this. How does directory recursion work without overflowing the stack, as the directory can have n levels of depth?
 
dragontamer5788
Gerbil Elite
Posts: 715
Joined: Mon May 06, 2013 8:39 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 8:56 pm

Igor_Kavinski wrote:
just brew it! wrote:
I actually dislike using Fibonacci (and factorial) examples to illustrate recursion. Not because they don't illustrate recursion, but because in the real world, they tend to result in excessively deep recursion which will typically cause a stack overflow error unless the inputs are small.


I have always wanted to know this. How does directory recursion work without overflowing the stack, as the directory can have n levels of depth?


Because we have 8GB of RAM on most machines these days, the stack won't really run out of memory until you fill that up.

The OS does a bit of virtual-memory magic to move the stack around physically, while the programmer only has to think of logical code addresses. But ultimately, the OS can move the stack around (even to the hard drive temporarily with "swap"). And the OS is very good at moving memory around and using it efficiently, be it Linux or Windows.
 
Igor_Kavinski
Minister of Gerbil Affairs
Topic Author
Posts: 2077
Joined: Fri Dec 22, 2006 2:34 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 10:04 pm

dragontamer5788 wrote:
The OS does a bit of virtual-memory magic to move the stack around physically, while the programmer only has to think of logical code addresses. But ultimately, the OS can move the stack around (even to the hard drive temporarily with "swap"). And the OS is very good at moving memory around and using it efficiently, be it Linux or Windows.


The stack in a function is the register space and the stack outside the function (global variables) is the heap, correct?
 
Redocbew
Minister of Gerbil Affairs
Posts: 2495
Joined: Sat Mar 15, 2014 11:44 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 10:40 pm

It depends on the language also. In C, that sounds correct. Variables declared within a function are stored on the stack while globals are stored on the heap. Additional memory you request from the OS typically comes from the heap as well. On what device that data is physically being stored(main memory vs hard drives vs toaster ovens) is an implementation detail. Compare that to a language like Python which does memory management for you and things may be different. In all of the Python apps I've written over the past few months my data could be on the stack, or the heap, or some combination of both. I've got no idea.

That's not to say the inner workings of a high level language like Python are unimportant. It may be required of you to know that stuff in order for your application to behave as expected, but it's not necessarily required by the language just to get things up and running.
Do not meddle in the affairs of archers, for they are subtle and you won't hear them coming.
 
dragontamer5788
Gerbil Elite
Posts: 715
Joined: Mon May 06, 2013 8:39 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Tue Apr 21, 2020 11:46 pm

Igor_Kavinski wrote:
dragontamer5788 wrote:
The OS does a bit of virtual-memory magic to move the stack around physically, while the programmer only has to think of logical code addresses. But ultimately, the OS can move the stack around (even to the hard drive temporarily with "swap"). And the OS is very good at moving memory around and using it efficiently, be it Linux or Windows.


The stack in a function is the register space and the stack outside the function (global variables) is the heap, correct?


No. C doesn't have registers. Registers are an assembly thing and you can almost completely ignore them in higher level languages. (Exceptions: volatile memory locations, obscure threading issues, memory barriers. The stuff of experts. Beginners simply will not come across these situations).

--------

Virtual memory in C / Windows / Linux is just a giant array that you can access in the program. The stack is somewhere in memory, you don't care where it is exactly. The C compiler works the stack to create local variables inside of a function.

The heap is completely different (and has lots of meanings). I assume you're talking about the malloc / free heap, which is just... the memory locations where malloc and free operate. When you request malloc(), you get data from the so called "heap". And when you say "free", you tell the memory-manager that you aren't using that memory location anymore and that they can recycle it into another malloc() call later.
 
Igor_Kavinski
Minister of Gerbil Affairs
Topic Author
Posts: 2077
Joined: Fri Dec 22, 2006 2:34 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Wed Apr 22, 2020 12:15 am

dragontamer5788 wrote:
No. C doesn't have registers. Registers are an assembly thing and you can almost completely ignore them in higher level languages. (Exceptions: volatile memory locations, obscure threading issues, memory barriers. The stuff of experts. Beginners simply will not come across these situations).

--------

Virtual memory in C / Windows / Linux is just a giant array that you can access in the program. The stack is somewhere in memory, you don't care where it is exactly. The C compiler works the stack to create local variables inside of a function.

The heap is completely different (and has lots of meanings). I assume you're talking about the malloc / free heap, which is just... the memory locations where malloc and free operate. When you request malloc(), you get data from the so called "heap". And when you say "free", you tell the memory-manager that you aren't using that memory location anymore and that they can recycle it into another malloc() call later.


I was under the impression that local variables meant faster access to data and global variables meant higher latencies. I've made a lot of global variables in my C programs for the ease of accessing them from any function without having to pass values but if my code were reviewed, it would be declared a big heap of stink coz you are not supposed to go global unless absolutely necessary.
 
dragontamer5788
Gerbil Elite
Posts: 715
Joined: Mon May 06, 2013 8:39 am

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Wed Apr 22, 2020 1:07 am

Igor_Kavinski wrote:
dragontamer5788 wrote:
No. C doesn't have registers. Registers are an assembly thing and you can almost completely ignore them in higher level languages. (Exceptions: volatile memory locations, obscure threading issues, memory barriers. The stuff of experts. Beginners simply will not come across these situations).

--------

Virtual memory in C / Windows / Linux is just a giant array that you can access in the program. The stack is somewhere in memory, you don't care where it is exactly. The C compiler works the stack to create local variables inside of a function.

The heap is completely different (and has lots of meanings). I assume you're talking about the malloc / free heap, which is just... the memory locations where malloc and free operate. When you request malloc(), you get data from the so called "heap". And when you say "free", you tell the memory-manager that you aren't using that memory location anymore and that they can recycle it into another malloc() call later.


I was under the impression that local variables meant faster access to data and global variables meant higher latencies.


If you care about speed, you need to measure the speed of your code. Don't assume, measure.

With that being said, computers don't work like that. Local variables might be faster due to how the L1 cache works (many variables would be overwriting each other on the stack, sharing L1 cache). But this has nothing to do with registers. Furthermore, if a global variable were accessed, it'd be placed into L1 cache and then be very fast as well.

I've made a lot of global variables in my C programs for the ease of accessing them from any function without having to pass values but if my code were reviewed, it would be declared a big heap of stink coz you are not supposed to go global unless absolutely necessary.


Global variables are bad because they have infinite scope. the only way to understand a global variable is to read ALL the code in the program. This is very difficult.

A local variable disappears at the end of the function. If you keep your functions small, maybe 20-lines typically and under 200 lines in most cases, then local variables are small enough that they're easy to think about. Global variables can interact across your entire codebase, maybe touching 10,000 or 10-million lines of code. Its just too difficult to think about globals the bigger and bigger your code gets.
 
just brew it!
Administrator
Posts: 54500
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Recursion (split from Dynamic Programming Tips and Tricks)

Wed Apr 22, 2020 7:58 am

dragontamer5788 wrote:
Igor_Kavinski wrote:
I have always wanted to know this. How does directory recursion work without overflowing the stack, as the directory can have n levels of depth?

Because we have 8GB of RAM on most machines these days, the stack won't really run out of memory until you fill that up.

The OS does a bit of virtual-memory magic to move the stack around physically, while the programmer only has to think of logical code addresses. But ultimately, the OS can move the stack around (even to the hard drive temporarily with "swap"). And the OS is very good at moving memory around and using it efficiently, be it Linux or Windows.

Well, yes and no. Maximum stack size is typically capped much lower, but a user with sufficient permissions can raise the cap. On the Debian system I am posting this from, it appears the default stack size limit is 8MB.

You would need a very deep directory hierarchy to exceed 8MB in a recursive traversal. Each additional level is probably going to add only a few hundred bytes to your stack at most. Good C/C++ programmers will avoid creating large data structures on the stack; if you need to allocate something big, that should go on the heap.

Igor_Kavinski wrote:
The stack in a function is the register space and the stack outside the function (global variables) is the heap, correct?

No, there is A register (the stack pointer) which keeps track of your current location in the stack, but the stack itself is stored in RAM. The heap is a subset of your global variables; not all global variables are part of the heap.

The difference between stack and heap is that the stack is accessed last-in, first-out (via the above mentioned stack pointer), whereas Items on the heap are created and destroyed dynamically as the program runs, and are accessed randomly via ad-hoc pointers. Non-heap global variables typically have a fixed location in memory, and exist for the entire lifetime of the program.

Redocbew wrote:
It depends on the language also. In C, that sounds correct. Variables declared within a function are stored on the stack while globals are stored on the heap. Additional memory you request from the OS typically comes from the heap as well. On what device that data is physically being stored(main memory vs hard drives vs toaster ovens) is an implementation detail. Compare that to a language like Python which does memory management for you and things may be different. In all of the Python apps I've written over the past few months my data could be on the stack, or the heap, or some combination of both. I've got no idea.

That's not to say the inner workings of a high level language like Python are unimportant. It may be required of you to know that stuff in order for your application to behave as expected, but it's not necessarily required by the language just to get things up and running.

While you are not accessing the "bare metal" stack and heap directly in Python, the interpreter does have its own stack and heap abstractions. So most of the same concepts still apply, but the implementation is "virtual" instead of mapping directly to hardware. By doing this, the Python interpreter can protect you from common C/C++ coding errors like uninitialized pointers, heap memory leaks and double-frees, etc... and you get helpful diagnostics when things do go wrong, instead of cryptic UAEs/segfaults.

Igor_Kavinski wrote:
I was under the impression that local variables meant faster access to data and global variables meant higher latencies. I've made a lot of global variables in my C programs for the ease of accessing them from any function without having to pass values but if my code were reviewed, it would be declared a big heap of stink coz you are not supposed to go global unless absolutely necessary.

It is somewhat CPU architecture dependent, but in general, no there should not be a significant performance penalty for global variables. Assuming we're talking about a compiled language like C/C++, a local variable is typically accessed at a constant offset from the current stack pointer value. A global variable is accessed at a constant offset from the program's data segment starting address. Either way, CPU is loading the offset value and adding it to the base address to get the address of the variable. Global might be SLIGHTLY slower if the offset value is wider (i.e. 64-bit vs. 32-bit), or if it results in a cache miss (unless the global is a frequently accessed one, items near the top of the stack are more likely to be in cache), but the difference is going to be negligible in most cases.

The reason use of globals is discouraged is not because of efficiency, it it is because in larger programs it results in code which is harder to debug, maintain, and re-use. Think of it this way: every global variable is a potential input or output to every function. Hidden side effects which are not related to the function arguments or return value make program behavior more difficult to analyze.

Edit: In a multi-threaded program, lock contention for globals can potentially become a serious performance killer, but this isn't directly related to how the variable is accessed at the hardware level.
Nostalgia isn't what it used to be.

Who is online

Users browsing this forum: No registered users and 1 guest
GZIP: On