Phase 3

(Warning: this is a rather wordy post. Most of the tests, results, and modified code are in my GitHub, linked below, rather than here.)
This is the wrap-up for the project I selected. To recap, I chose to try to optimize part of glibc’s stdio, specifically the w+ mode of the fopen function, which is declared in libio/stdio.h. Hopefully I will be able to finish what I started over the summer during my work term. SPO600 was an interesting course. The labs were a good build-up toward approaching this project, and it was a little scary trying (and failing) to contribute to the open source community. It felt more like real-world experience than many of my other classes at Seneca, where labs and assignments have a very narrow scope (i.e., the requirements are already given and you just code the functionality). In those classes I never needed to measure run times, dig into the assembly generated from my code, or worry about optimizing as long as the code worked. It was a good experience to (attempt to) apply the optimization concepts I learned to software used in the open source community (glibc).

After tracing files and tracking down where changes could be made, I found it very difficult to decide where in the code to focus. Honestly, some of the changes I made either produced no meaningful results or resulted in things “breaking” (the test file not behaving the way I expected, or the run time going up).

I have not tried to make a pull request or send anything upstream because, in my opinion, I did not make enough progress to be worth submitting. I also did not reach the stage of testing on an architecture other than AArch64, for the same reason: there is no point testing other architectures for speed when the change doesn’t reliably speed things up on AArch64.

A huge difficulty in this project was isolating a specific area of code to work on. I found it hard to find an obvious place where improvements could ACTUALLY be made. Searching through a multitude of files to locate where the w+ mode of fopen could be optimized was extremely difficult. First of all, I could not locate the source code for what I planned to modify: I ended up tracing it to the int open function but could not find its actual definition, which does not make sense to me since I have the latest version of glibc.

Another personal difficulty was allocating time for this optimization. Unfortunately, due to circumstances, I wasn’t able to allot the time I wanted to this project, having had to prioritize family matters and other unfortunate events. We were warned at the start of the course to work on this project over a long period of time, and that is a lesson learned.

For future large assignments I would approach things differently. It’s apparent that it’s always better to start earlier and work in smaller chunks rather than big ones. Also, unlike most school labs and assignments, which are small in scope and can be done alone, larger tasks require me to reach out for help sooner.

Now that I have a better understanding of what a project like this involves, here is how I would approach it again, especially since I plan to continue this work during the summer. I will reach out to the community sooner to ask for advice instead of relying almost exclusively on my own research. There are always people with more expertise on a subject, and it’s a waste not to use those resources. Working on something consistently also leaves time to hear back from different resources (some of which may not respond right away, or at all). This would give me more insight into my project and assignments.

The link to my GitHub is:

Above is where I’ve put my previous work, labs, and (hopefully) the future implementation, once I find where to properly optimize this and get a meaningfully lower run time in my tests.


Lab 3 (Assembly Lab)

For this lab, we wrote simple assembly programs. The program given to us printed the message “loop” 10 times. We were to change it to print numbers in ascending order. (First 1-9, then

Assembly is a lot more exacting than higher level programming languages (C, C++, Java, etc). You must tell the machine exactly what to do, including how to handle memory, which gives you a higher level of control but puts all the responsibility on you. I am definitely more comfortable working with higher level languages, as I already have experience writing and debugging such programs, compared to Assembly.

The x86_64 version of this lab was run on Xerxes, and the AArch64 version of this lab was run on Betty.

Assembly seems like it has its uses and is interesting for programmers looking into actual machine code, but I had difficulty writing and debugging the code.

Lab 4 (Compiled C Lab)

For this lab we wrote a basic C program which printed a message on the screen, Hello World!:

#include <stdio.h>

int main() {
    printf("Hello World!\n");
    return 0;
}

This program was compiled with the following options:

-g               # enable debugging information
-O0              # do not optimize (that's a capital letter and then the digit zero)
-fno-builtin     # do not use builtin function optimizations

(1) Added the compiler option -static.

  • Creates standalone application and adds all C libraries to the executable
  • Immensely increases size of file

(2) Removed the compiler option -fno-builtin.

  • The compiler uses “puts” instead of “printf” -> it takes only one argument, the string to display
  • Results in slightly less work at run time, since puts does not have to parse a format string
  • -fno-builtin tells the compiler NOT to use its builtin function optimizations, so removing it lets the compiler substitute the simpler “puts” call

(3) Removed the compiler option -g.

  • Removes debugging information
  • File size decreases

(4) Added additional arguments to the printf() function in program.

  • On x86_64, the first 6 integer or pointer arguments (including the format string) are passed in registers; any further printf arguments have to be pushed onto the stack
  • Program speed decreased due to having to read each extra argument back from the stack

(5) Moved the printf() call to a separate function named output(), and called function from main().

  • Increases length of process -> main has to call “output” first, which THEN calls printf (Hello world)

(6) Removed -O0 and added -O3 to the gcc options.

  • From no optimization (-O0) to the highest standard optimization level (-O3)
  • Increased size of file

Ending note:

It was interesting to try different options and see what the compiler actually does with your code. (Many new programmers don’t understand why certain things behave the way they do, or how options speed up or slow down compile time and run time.)

Lab 5 (Algorithm Selection Lab)

For this lab two algorithms were tested. Signed 16-bit integers representing sound waveform samples are each multiplied by a floating point “volume scaling factor” in the range 0.000-1.000. One approach was naive multiplication of the sample by the volume scaling factor, and the second approach was bit-shifting.

The naive multiplication is very straightforward: multiply each 16-bit integer sample by the volume scaling factor, which was represented by a floating point value. (This is the “naive” step-by-step solution).

This would be represented by your typical for loop:

float volume = 1.0;
for (int i = 0; i < sizeOfSamples; i++)
    samplesArray[i] *= volume;

The higher the optimization level, the slower the compile time; code size increases and execution speed increases.

Both the bitwise and naive algorithms can keep up when fed 44100 samples per second x 2 channels (CD rate).

Both algorithms were similar in speed, with the bitwise version edging out naive multiplication at large sample counts.

Phase 2

Opening a file with a + means the implementation expects that a read OR a write may occur.

A possible idea is to modify the implementation so that it does NOT prepare for reads UNTIL a read call is actually made, for example.

Searched libio for where to modify this; it looks like fileops.c contains the code relevant to how fopen treats different modes.

Since fwrite only needs a valid pointer into the buffer, fread (or any other read) SHOULDN’T need to be called.

(Looking in libio)

The function declaration is in glibc/libio/stdio.h, but the actual fopen mode handling is in fileops.c


Going to examine both fileops.c and the fwrite function

//looking at fopen function, and how it handles the “w+” case, line 268

  if (_IO_file_is_open (fp))
    return 0;
  switch (*mode)
    {
    case 'r':
      omode = O_RDONLY;
      read_write = _IO_NO_WRITES;
      break;
    case 'w':
      omode = O_WRONLY;
      oflags = O_CREAT|O_TRUNC;
      read_write = _IO_NO_READS;
      break;
    case 'a':
      omode = O_WRONLY;
      oflags = O_CREAT|O_APPEND;
      read_write = _IO_NO_READS|_IO_IS_APPENDING;
      break;
    default:
      __set_errno (EINVAL);
      return NULL;
    }
#ifdef _LIBC
  last_recognized = mode;
#endif
  for (i = 1; i < 7; ++i)
    {
      switch (*++mode)
        {
        case '\0':
          break;
        case '+':
          omode = O_RDWR;
          read_write &= _IO_IS_APPENDING;
          /* ... remaining mode characters omitted from this excerpt ... */
        }
    }

//line 336

result = _IO_file_open (fp, filename, omode|oflags, oprot, read_write,
                        is32not64);

Noted here that I should not change the actual functionality of the modes (i.e., which properties are attached to which modes); rather, I should change the function that handles the input from these modes, which is _IO_file_open.

// the omode/read_write values are passed to the _IO_file_open function, and its return value is stored in result

//Around line 220 in fileops.c, the _IO_file_open function handles the read/write mode

_IO_FILE *
_IO_file_open (_IO_FILE *fp, const char *filename, int posix_mode, int prot,
               int read_write, int is32not64)
{
  int fdesc;
#ifdef _LIBC
  if (__glibc_unlikely (fp->_flags2 & _IO_FLAGS2_NOTCANCEL))
    fdesc = open_not_cancel (filename,
                             posix_mode | (is32not64 ? 0 : O_LARGEFILE), prot);
  else
    fdesc = open (filename, posix_mode | (is32not64 ? 0 : O_LARGEFILE), prot);
#else
  fdesc = open (filename, posix_mode, prot);
#endif
  if (fdesc < 0)
    return NULL;
  fp->_fileno = fdesc;
  _IO_mask_flags (fp, read_write,_IO_NO_READS+_IO_NO_WRITES+_IO_IS_APPENDING);
  /* For append mode, send the file offset to the end of the file.  Don't
     update the offset cache though, since the file handle is not active.  */
  if ((read_write & (_IO_IS_APPENDING | _IO_NO_READS))
      == (_IO_IS_APPENDING | _IO_NO_READS))
    {
      _IO_off64_t new_pos = _IO_SYSSEEK (fp, 0, _IO_seek_end);
      if (new_pos == _IO_pos_BAD && errno != ESPIPE)
        {
          close_not_cancel (fdesc);
          return NULL;
        }
    }
  _IO_link_in ((struct _IO_FILE_plus *) fp);
  return fp;
}

The open function is what I’m looking for, since it is what actually receives the flags for the w+ mode.

int open(const char *pathname, int flags, mode_t mode);

This is what I’m looking for, so I searched for it the same way I traced the other functions (grep pattern).

The actual int open function is not in libio. The search shows it declared in io/fcntl.h at line 180 and listed in conform/data/fcntl.h-data at line 97.

[dleung25@betty libio]$ grep -rnw '/home/dleung25/src/glibc' -e "int open"

(search results for int open function)
/home/dleung25/src/glibc/conform/data/fcntl.h-data:97:function int open (const char*, int, ...)
/home/dleung25/src/glibc/io/fcntl.h:180:extern int open (const char *__file, int __oflag, ...) __nonnull ((1));
/home/dleung25/src/glibc/manual/llio.texi:84:@deftypefun int open (const char *@var{filename}, int @var{flags}[, mode_t @var{mode}])

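It doesn’t look like I can actually find the source code for open, strangely enough. (One likely reason, which I have not verified in this tree: glibc defines the function under an internal name such as __libc_open and exports open as an alias, so a search for “int open” only matches declarations and documentation.)

fcntl.h-data in conform/data doesn’t provide any source code.

The manual (of course) only describes what it does and how it works, which is not what I’m looking for.

I’ll put that aside for now and look at fwrite specifically (fseek has to do some reading, so I’ll avoid making any changes to that). Maybe I can modify fwrite to handle w+ cases specifically.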

Phase 1

Working with stdio.h

There seems to be a fixable issue specifically with files opened in w+ mode.

Running fseek (places file position indicator x bytes away from an indicated origin) and fwrite (write data to file stream) calls doesn’t seem to be a problem.

test file
test fopen

#include <stdio.h>
int main (void){

    printf( "Start Test\n" );
    int fileSize = 40;
    int position;
    int fError;
    char buffer[] = { 'a' };
    FILE *f;
    //open test_file for writing
    f = fopen( "test_file", "w+" );
    //fail to open for writing
    if( !f ){
        printf( "File failed to open\n" );
        return 1;
    }
    //loop, places an a each write, moves position indicator one byte more away from start of file each time
    for( position = 0; position < 1000; position += sizeof( buffer ) ){
        //place position indicator position bytes from start of file
        fError = fseek( f, position, SEEK_SET);
        //fseek fail
        if( fError != 0){
            printf( "Fseek failed\n" );
            return 2;
        }
        //write to file
        fError = fwrite( buffer, sizeof(char), sizeof(buffer), f);
        //fwrite fail
        if( fError != 1 ){
            printf( "Fwrite failed\n" );
            return 3;
        }
    }
    //close file
    fclose( f );
    printf( "Test Finished\n" );
    return 0;
}

This is the output on betty:

[dleung25@betty glibc]$ gcc -g -o test test.c
[dleung25@betty glibc]$ ./test
Start Test
Test Finished
[dleung25@betty glibc]$

Looking at the output file test_file:


looks like the expected write into this test file.

However, running it under strace (which lets us check what system calls are being made by this program):


The seek and write calls are expected. The read calls to the kernel are not expected and look pointless here, even though w+ means that both writes and reads could occur.


WIP: finding out where and why read is being called when the sample program doesn’t request it.

The buffering logic doesn’t seem to be located in fopen.c.

Looking at libio’s iofopen.c:

_IO_FILE *
__fopen_internal (const char *filename, const char *mode, int is32)
{
  struct locked_FILE
  {
    struct _IO_FILE_plus fp;
#ifdef _IO_MTSAFE_IO
    _IO_lock_t lock;
#endif
    struct _IO_wide_data wd;
  } *new_f = (struct locked_FILE *) malloc (sizeof (struct locked_FILE));

  if (new_f == NULL)
    return NULL;
#ifdef _IO_MTSAFE_IO
  new_f->fp.file._lock = &new_f->lock;
#endif
#if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
  _IO_no_init (&new_f->fp.file, 0, 0, &new_f->wd, &_IO_wfile_jumps);
#else
  _IO_no_init (&new_f->fp.file, 1, 0, NULL, NULL);
#endif
  _IO_JUMPS (&new_f->fp) = &_IO_file_jumps;
  _IO_new_file_init_internal (&new_f->fp);
#if  !_IO_UNIFIED_JUMPTABLES
  new_f->fp.vtable = NULL;
#endif
  if (_IO_file_fopen ((_IO_FILE *) new_f, filename, mode, is32) != NULL)
    return __fopen_maybe_mmap (&new_f->fp.file);

  _IO_un_link (&new_f->fp);
  free (new_f);
  return NULL;
}

_IO_FILE *
_IO_new_fopen (const char *filename, const char *mode)
{
  return __fopen_internal (filename, mode, 1);
}

#ifdef _LIBC
strong_alias (_IO_new_fopen, __new_fopen)
versioned_symbol (libc, _IO_new_fopen, _IO_fopen, GLIBC_2_1);
versioned_symbol (libc, __new_fopen, fopen, GLIBC_2_1);

# if !defined O_LARGEFILE || O_LARGEFILE == 0
weak_alias (_IO_new_fopen, _IO_fopen64)
weak_alias (_IO_new_fopen, fopen64)
# endif
#endif


Compiling/building glibc

Attempt to build glibc on matrix/zenit servers.

Following the instructions on building/compiling the latest version (2.25):

(Building without installing)

$ mkdir $HOME/src
$ cd $HOME/src
$ git clone git://
$ mkdir -p $HOME/build/glibc
$ cd $HOME/build/glibc
$ $HOME/src/glibc/configure --prefix=/usr
$ make


Ran into issues with disk quota on both zenit and matrix during the git clone process. Even with deleting unnecessary files, it seems that the quota is limited for accounts given by the school (for obvious reasons).

Attempt to install on local linux machine:

Downloaded VMware and VirtualBox.

Ran into problems after downloading openSUSE-13.1-DVD-x86_64 (compatible with downloading and compiling the latest version of glibc).

For some reason, the only install options offered for this ISO were 32-bit, with no 64-bit option in either VMware or VirtualBox.

Gave up on installing a local Linux machine (simply too time-consuming to download the ISO) and went for the easiest option (building on betty/xerxes).

Attempt to install on betty:

Following the instructions on building/compiling the latest version (2.25):

(Building without installing)

$ mkdir $HOME/src
$ cd $HOME/src
$ git clone git://
$ mkdir -p $HOME/build/glibc
$ cd $HOME/build/glibc
$ $HOME/src/glibc/configure --prefix=/usr
$ make




Open Source Software Packages

Selected packages: GCC, BASH


Testing patches for submission: 

Changes to the back end or the C/C++ front end: complete build of GCC and runtime libraries on at least one target, complete bootstrap on all default languages, and run all testsuites (make bootstrap; make -k check; will accomplish this).

Changes to a front end: perform a complete bootstrap.

Test exactly the change that is intended for submission.

Submitting patches:

  • Description of problem/bug and how patch addresses this
  • Testcases
  • ChangeLog
  • Bootstrapping and testing
  • the patch itself

bundled into mail message and sent to appropriate mailing list(s)

GCC 6 Release Series (6.2) Patch

Participants and role

Problem Report and users involved listed here ->

issues discussed (and resolved)

aside from problem report above, specific targeted issues:

  • Support for the --with-cpu-32 and --with-cpu-64 configure options has been added on bi-architecture platforms
  • Support for the SPARC M7(Niagara 7) processor has been added.
  • Support for the VIS 4.0 instruction set has been added