Arrays

Data structures Series  – Blog 01

Hi Guys! It’s September and as promised in my last post, a data structure blog is ready and waiting to be read.*** Drum-rolls please ***  Here’s my first blog in the Data Structure series. 😉


Data Structures

Data structures refer to data representation or data organization for holding program data in the computer memory aiding with the processing or operations to be performed on the data.


Arrays

Arrays are basic data structures in which :

a. All values held by a given array are of the same type – homogeneous

b. Values are allocated contiguous locations in computer memory


Arrays in Memory

I believe it may be a good idea for a programmer to know about behind the scenes of the code written, for efficient programming – that is how code interacts with Memory.  This block is mainly intended at visualizing how arrays are stored in computer memory.

Declaration and usage of arrays need us to know the element_type , name_for_array and array_size :

element_type indicates the data_type (such as int , char, float etc) of the element_value that shall be stored in a given array.

name_of_array – serves as constant pointer to first element of the array when we need to look up or change element_values in the array.

array_size shall indicate the number of elements held by the given array.

Fig 1 shows the side by side view of memory representation of two arrays A and T representing a 1-D and a multidimensional array respectively.

Real world data representation shows how we visualize the data that is to be stored. Declaration shows the code used to declare an array to hold this data and “corresponding memory layout” shows how it is actually stored in computer memory.  In case of 1-D arrays , this is pretty straight forward, how you see is also how  the computer represents data. However, in case of multi-dimensional arrays on the right side , note that how we see data and how it is stored varies. Data is stored row wise , each row  translated to a 1-D representation starting with 1-D array for elements of row 0 {10,20,30}, followed by row 1 {40,50,60} and so on.

For multi-dimensional arrays can be stored as row-major (as depicted) or column major. In case of column major, each column shall be represented like a 1-D array at a time, instead of row.

arr_addressing.png
Fig 1. Memory Layout of arrays

Now that we have an idea of how arrays are represented in the memory, we can take a look at operations on arrays next.


Array operations

1-D Arrays

  • Various supported declarations and initialization of arrays in C/C++ is as shown below.
/* Global arrays that are initialized without size being specified */ 
int c[]={2,3,4};

/* Global arrays without explicit initialization are 
   set to zero by default */
int d[10];

//local declarations of arrays
int main(){
 /* Array size variables must be declared as constants */ 
    const int a_size = 5, b_size = 3, c_size= 3;

 /* array 'a' is initialized with element values */ 
    int a[a_size] = {3,4,5,1,6}; 

 /* array 'b' is initialized without any values using {}   
    all elements shall be by default initialized to zero*/ 
    int b[b_size]={};
}
  • Array elements are referenced by an index. The value of index ranges from 0 to n-1 for an array of size ‘n’. Highlighted in red shows how an integer array can be passed to a function either as an integer pointer or array with dimension or no dimension specified as shown below.
/* array as function param can be passed as a pointer */
void print_arr(int* arr, int n){

/* Given array size, we can look up array elements using index ,    
in this case i , that takes values from 0 to n-1*/ 
for (int i=0; i<n;i++){
 std::cout<<arr[i]<<std::endl; 
}
}

/* another way to pass array to functions 
   using dimension-less array representation */
void print_arr(int arr[], int n){
}

/* array can also be passed as a sized array */
void print_arr(int arr[5], int n){
// Note: in case you pass some other dimension array, 
   no bound check is performed. So, no error reported */
// check the section below under "Fact, Did you know" :)
}

Multi-dimensional Arrays

/* declaring multi-dimensional array of size 2x2 */
/* each row is like a 1-D array, here 2 rows are 
   declared one after the other*/
int arr[2][2]={{3,4}{5,6}};

Assigning values to various indices and look up of element value at a given index is O(1) operation.

However, if you need to rearrange or move around data, it can get expensive. For example, as shown in Fig 2, if we want to move the last element to start of the array of size ‘n’, we will need n-1 copy operations to move the elements downwards and then place the element to the beginning of the array. We can clearly see this can get expensive when the number of elements get higher and if this operation is to be performed frequently.

Also, in case of arrays, another problem is static allocation. If we do not have an idea of  how much space is to be allocated, we may end up allocating more than required and wasting space or less than required – in which case we would need reallocation.

There is another data structure that takes care of these problems – namely “linked lists” (coming up next week ! 🙂 )

arr_expensive.png
Fig 2. operations on arrays to rearrange data

Relation of arrays and pointers

If you are working on cpp or Java, worry not my friend ! 😉 it’s all taken care of, for you. While C programmers, take some time to wrap your head around this. It is simple once you understand ! 🙂

// quick pointer refresher 
For a given pointer : int* ptr ;
ptr is a pointer to an integer data

ptr holds the address 
*ptr gives the value at the address held by ptr // de-reference

// for pointer to pointer case 
int** p_to_p;

*p_to_p holds the address
*(*p_to_p) gives value at address // de-reference

Let’s consider a 2-D array :

int a[m][n];
//2D array is an array of 1-D arrays per row
// 1D array name is nothing but a pointer to an integer
//2D array name is hence a pointer to pointer to an integer
  • A[0] is address of row 0
  • Value of element in column 0 of row 0, A[0][0], can be accessed using *(A[0]+0) , ie., A[0]
  • element 1 of row 0, A[0][1] is accessed using *(A[0]+1)
  • Similarly, A[1] is address of row 1
  • Value of element 0 of row 1, A[1][0], can be accessed using *(A[1]+0) , ie., A[1]
  • value of element 1 of row 1, A[1][1], can be accessed using *(A[1]+1)
*/ So, value of c'th element on r'th row , value can be 
   looked up using pointer as below.*/
   *(A[r]+c);    ... (i) 

/* Further, element A[r] can be indexed using an offset from A[0] as A[0]+r, 
   so A[r] value can be accessed using *(A+r).
   substituting in (i), we can look up value A[r][c] using*/
    *(*(A+r)+c);   ... (ii)

Summarizing above :

  • De-reference of A  (ie., *A) gives address of first element of ROW 0 or A[0]. A[0] is a pointer to an integer
  • De-reference of A[0]  gives first entry of A or A[0][0]. ie., **A = A[0][0]
  • In general , we can look up value at column ‘c’, of row ‘r’ using  *(*(A+r)+c))

Do read up further about array operation using pointers if you are interested.

Static and Dynamic arrays

Arrays can also be further split into two types : fixed size arrays  and dynamic arrays that can be resized. In Cpp we can use vectors in case we need dynamic arrays.


Application

If you are designing something and you have a checklist that has requirements below :

  • Efficient element look up
  • Don’t need to often  rearrange values

Arrays would be a good data structure to use, for they support O(1) look-ups.

One quick use case I can think of is to hold look up tables of fixed size which are containers for some constants to be used in computations – arrays are your BAE ! 😀

Also, check adjacency matrix required here.


Fact , Did you know?

When you access an array element of invalid index, C/C++ compilers don’t complaint. The behavior is “undefined”. They may look up some random location, if the address is in valid data region of the memory and return some junk value.  Oh well ! Talk about silly coding mistakes, undefined compiler behaviors – resulting in hard bugs to debug in production, sigh!

C++ supports std::vector which is a template class. It supports dynamic resizing of arrays. This class supports at() member function for an element lookup, which shall perform guaranteed bound check. But, it also supports operator [] to lookup elements, that is retained same as in C arrays and performs NO BOUND check.

/* declare a vector type and add only two elements */
std::vector<int> arr;
arr.push_back(1);
arr.push_back(2);

/* Try to access out of bound index 
   only valid access indices are 0 and 1 */
std::cout<<arr.at(2)<<std::endl; // error thrown
std::cout<<arr[2]<<std::endl; // NO ERROR

Related Reading

I am adding some links to my previous blogs as well as other material below for further reading.  Do check them out 🙂

  1. Interaction of program with memory
  2. Data structures for graph
  3. Char arrays
  4. Arraylists in Java 
  5. Solve programming problems related to Arrays

Until next time – keep decoding !

If this was helpful or you have some feedback, do write to me at decodergirlblog@gmail.com 🙂 OR :


Did you know?

More coffee you donate, higher is your coder karma! Oh! there shall also be more blogs published here. Thank you for being awesome! 😀

$1.00

Tail Recursion

In my previous blog on recursion, we noted the comparison between iterative and recursive implementations. Now, if we want the memory consumption of  an iterative approach and the elegance of a recursive implementation, then I had mentioned that we would need to take a look at : Tail recursion! So, let’s start decoding this.


What is Tail Recursion ?

Tail call refers to an operation where we call a function say, g() and immediately return it’s return value as the return value of our function f(). In this case, we will not  need to preserve any of the state of the current code (state of g()) – since we are just about to throw it away and return.

Tail recursion is a form of recursion where the recursive call (call to same procedure f() itself) is the absolute last operation performed (called recursive tail call). Basically, meaning that there is nothing left to execute in the function after recursive calls – ie., the function returns the result of a recursive call.

The result of recursive call in this case can be passed directly for extra computation,  instead of waiting for it , which translates to no extra stack space. Normal recursive call, on the other hand, has a stack frame allocated, so that the compiler remembers where to get back to and to have the necessary variable values ready after recursive call is finished. This optimization is called – tail call optimization (TCO).

To write tail recursive version of the function, it means that we need to make sure recursive call is the absolute last operation in the function – there should not be any extra operations performed after the recursive call.


Let’s consider a very simple example to understand this concept.

Example

Consider a recursive function that returns the sum of ‘n’ positive numbers.

I .Recursive implementation

  • The recursive formula for sum of “n” numbers is given by: sum(n) = n+sum(n-1)
  • and sum is 1 when input is 1. (base case)

int sum(int n)

{

// Base condition for input = 1.

if(n==1) {

return 1;

}

// cases greater than 1.

else {

return n+sum(n-1); //This has extra operation of adding “n” to sum returned from recursive  function call

}

}

II. Implementing tail recursive version

  • To re-write sum function above in tail recursive form, all we have to do is :
    1. Identify the extra operation performed between return and recursive call . In this case addition of “n” to what is returned from recursive call.
    2. Modify and re-write the function to do this extra operation internally and then return, usually involves passing extra state variable called accumulation.
    3. Retain only recursive function call as the last operation in the method.

Now, let’s give tail recursive implementation a shot using the three steps outlined above.

  • In the recursive implementation above (under I),  we can see that there is an extra operation of adding output of recursive call to “n” at the end as highlighted in red.
  • For step 1 : The last line in the implementation can be split and written as :
  • y = sum(n-1);  
  • result = n+ y ;  //
  • return result ; 

For steps 2/3 : The new function version will take an extra argument which is an accumulation of current value. So, the extra operation to be performed is taken care of in the function call itself and not after return from recursive function call. Note : This style of passing control explicitly is called “Continuous Passing Style“. This makes sure that recursive call is also the last operation performed.

r2


Role of Compiler

Modern compilers support tail call elimination for tail recursive functions- an optimization called “TCO” (Tail Call optimization) as briefly mentioned above. TCO is the process by which a compiler can make a call to a function and take no additional stack space. The only case this would happens is if the last instruction executed in a function f() is a call to a function g(). Of course, in recursive case the call will be to itself f().

I have personally programmed in C and Python. GCC compiler (C compiler) offers TCO (Tail call optimizations).

Essentially, when we re-write the function in tail recursive form, the compiler optimization looks like below. It would be converted to a goto instead of recursive calls with stack allocation for it to remember the outcome of each recursive call to be used for the extra operation. This means that we  will not hit the stack over flow problem. Yay!

int tail_recursive_sum(int n, int accum=0) {
label:
    if (n == 1) return n+accum;
    accum += n--;
    goto label;
}

However, python implementation does not support TCO internally, which means it is expected that we judge the depth of recursion and re-write the iterative version, if we want to be safe with stack usage.

Generally, the need for recursive call optimization is dependent on the performance of your algorithm. If you have to implement a program that takes O(n) operations, it is not a good idea to use recursive version, while if it is O(logn) like tree implementations, we should be okay using the recursive version, we may not blow up stack.

Of course, I have shown a simple example here. There may be more complicated use cases. May be I will write about them soon in another blog. 😉

I want to keep the tidbit entries concise !


Conclusion

  • The function executes a little faster, since no stack pushes and pops are required
  • Function recursion depth is not a constraint any more, as there is no linear growth of stack usage with recursive calls.
  • We will not hit the stack overflow issue with this implementation.

So, if you are coding in C , you could implement a tail recursive way to get the benefit of compiler TCO, while for python use the iterative version! :). That’s all for today !

Check this for more details on this topic w.r.t C++ as mentioned by John in the comments below.  I will need to catch up on this as well. So, until next time,  keep decoding !

Demystifying Grace Hopper Celebration

People often ask me why is there a conference only for women; isn’t the day we have no such conference – the time we have truly achieved equality? I do agree with this thought. But, to me this conference is not a celebration “for minority” but a celebration of the fact, as to how far we have come from being termed the minority.

Grace Hopper Celebration is the world’s largest conference for women in computing that has grown from a mere 500 women who came together in 1994 to 15,000 plus participants this year. This truly highlights the progress we are making on closing the gap between men and women in the field of Computing. It is needless to say how motivating and inspiring it was to learn about some wonderful women in the field and the contributions they have been making.

I had the opportunity to represent Qualcomm and be part of this amazing conference for the second time. I was introduced to this conference by fellow engineer Molly Nicholas, with whom I had collaborated on the idea of designing a hardware called Qbadge in the year 2015 – added wireless connectivity to the Hardware – (what I do best, of course ! ;)) and also some new neo-pixel patterns to light up the badge (check video below).  This was one of the few times, I had designed something on a basic breadboard and tested (pic below). It’s awesome what you explore and do when you collaborate.

Qualcomm hosted a Hackathon event that utilized the Qbadge hardware again in 2016. I was excited to be part of this event, to meet and answer the questions from the participants – young women engineers chosen from across the United States. The idea was to have a useful social interaction, make it a learning experience for the girls by exposing them to low level programming. We had them edit the programs loaded on the badges, while they solved some hackathon clues.

screenshot_2015-07-13-17-32-29-2    screenshot_20170304-125705

Pic 1 : Arduino Trinket interfaced with Bluetooth HC -05 chip

Pic 2: Internals of QBadge

Video below : Shows how the badge flashes with the programmed neopixel patterns that can be activated by a command from an Android app sent over Bluetooth connection:

One of the feedback I had from the first time at the conference in 2015 was that more men should also participate in this conference. Q showed an increased participation from the previous year , from 50 participants to 130 that included a few men as well. At the booth, Qualcomm had an invisible museum- named invisible but truly impactful Augmented Reality display. This seemed in-line with Megan Smith’s opinion she shared with a quote during her key note : “Women have always been an equal part of the past. We just haven’t been a part of history” – Gloria Steinem. This is very much the premise for Grace Hopper Conference – to recognize and highlight the contributions that women are making as well.

In my opinion, it would be a  good idea for more men to be a part of this conference to not only learn about and appreciate the contributions made by women but also to gain awareness about how they could aid with the efforts to close the gender gap, by getting involved in discussions about the problems or challenges faced by women. There were a number of men who were not only present at the conference but also made presentations such as this talk by Astro Teller, Google X.

Moving on, the technical theme for the year 2016 included Internet of Things, Artificial Intelligence, Human Computer Interaction and Data Security. While various technical talks in all the above tracks, such as self driving cars, programming humanoids, connecting millions of devices – engaged me and were giving a glimpse of tomorrow, there were also talks about solving the connectivity problem that had my attention. With the world progressing towards Drones, Smart Vehicles and Devices, Internet of Things – an all connected paradigm, it’s hard to imagine that two thirds of the world still remains unconnected – has no access to basic internet. As an Engineer at Qualcomm Research, I am excited to be part of the team that is working on connecting the unconnected.

Working on a myriad of problems and finding solutions for the questions of today and tomorrow, is what makes engineering a challenging as well as an enjoyable task. The term engineer has no gender. I believe that building a future we envision, takes each one of us and the path to that is – coming together and looking beyond the gender!

Lastly, to give you a sneak peak of how women are getting ahead on closing the gender gap in the tech industry, the winner of ABIE award for Technical Leadership,2016: Anna Patterson tried a simple way using the “show of lights” from the mobile phone to represent the show of strength. (video below)

But, it was her words : “No matter what you are going through, there is always a lesson to be learned and light at the end of the tunnel. Make sure you are looking for the light”  – that resonated  with me as I witnessed the entire hall light up around me!

More importantly, this made me realize how each one of us can make a difference to the over all picture. This post is dedicated to all the women engineers out there. Thank you to each of you for doing your bit and inspiring me each day to keep going. Together – we got this ! 


Note : This post in response to the Day-04 prompt word : “inspiration” – created by thestemsquad on Instagram in the on-going month of ‘get to know women in STEM’ initiative.