A great start to the New Year

Happy New Year! FlingOS had a fantastic year in 2015 and 2016 promises to be another exciting year of big progress. Already in the first 24 hours of 2016 we’ve had over 12,000 views of our website (and that’s before we count the blog, community forums, codebase and YouTube videos!). Here are some of the highlights from 2015 and our plans for 2016.

Our first summer intern, Roland Baranyi, joined us to add MIPS support to our compiler and created our FlingOops testing kernel. Sponsored by Imagination Technologies, Roland helped reform the Drivers Compiler into a cross-platform compiler and added the MIPS target architecture library. We also created a basic test kernel for the Creator CI20. More on where this is headed in 2016 further down!

On September 17th we launched a new FlingOS – a total new look for our website and a lot of content and progress to back it up. We added over 30 new articles and published our 10-video Getting Started tutorial series on YouTube. Since launch we’ve had thousands of views per month of the articles and just over a thousand per month of our videos.

On top of all that, I’ve been presenting about FlingOS and the educational problems the project is solving, in and around Bristol. I’ll be giving another talk in Hereford in February 2016. I also ran a series of lectures and workshops in the University of Bristol which were a resounding success.

In 2016 we’re aiming to build on our progress but also to better align what we’re producing with what industry wants and education needs. To achieve this, we’ll be creating an OS dev starter kit, aimed at A-Level and first-year university students.

The kit will be based on an embedded device to align with what the new OS dev industry is aimed at (and which we hope Imagination will support us with using the Creator CI20 or their latest CI40 board). We will be creating a complete self-teach course and lecture course (with slides, notes, examples and exercises) and selling the complete kit at low cost. But as always, our main codebase and core articles will remain free and open-source.

To succeed in producing all this, FlingOS needs sponsorship (or investment) to hire 3 to 5 interns next summer (myself included!) and to cover our basic costs. We’re still looking for sponsors, investors or people to donate! Please head to our Sponsor page to find out more if you think you can help.

2016 promises to be a year of big progress, with lots of new articles, a solid codebase and more great tutorial videos. We hope you’ll join us on our journey and support us by sharing, liking and Tweeting (@Fling_OS).

Many thanks to everyone who helped make 2015 a solid milestone in the project’s development!

Following up: “Avoiding the UK IoT Disaster” (BrisTech, 2015-12-03)

I thought I’d take the time to write a short blog post following up on the presentation that I gave to BrisTech the other night (2015-12-03) titled “Avoiding the UK IoT Disaster”. In the talk, I highlighted not security or technical problems with IoT but a key educational problem that will be putting our industry at serious risk: low-level development is not being taught in schools and is barely taught at universities. Slides are available here.

During the presentation I highlighted a couple of possible problem-applications, where IoT devices are being programmed using high-level code (often Python) on top of an embedded Linux stack. Such stacks and programming in Python allow rapid development and ease-of-update in future (which is useful/key for security) but sacrifice hardware efficiency and, even more so, battery life.

The specific example that I brought up was an IoT thermometer programmed in Python with wireless connectivity. I stated during the presentation that such a setup was not a good idea because thermometers were relatively simple devices that didn’t require such a complex stack and also that, as a device installed in a home, you would want a decent battery life.

This example caused some controversy and was the subject of a lot of the discussion at the end of the talk. One member of the audience, who works in the IoT industry, gave an interesting piece of analysis which is where I would like to start. The key point, which I feel is worth highlighting, is that in his work he sees three types of IoT device:

  1. Long-term, hard-to-replace devices, such as tremor and stress sensors built into bridges which often can’t be replaced, need to be extremely reliable and last a very, very long time (relative to standard technology cycles).
  2. Long-term, easy-to-replace devices, such as the thermometer example I gave earlier. These are devices which the user may want to have for a long time but they are easily replaceable.
  3. Short-term devices, where short-term is roughly the length of a standard technology cycle, 18 months to 2 years. These are devices which we would only expect to last that length of time before they need replacing.

The design approach for type (1) devices is relatively straightforward to analyse. These are devices which we may never be able to replace and which need to last a long time, thus battery life is a top priority. So are reliability and security, which are easier to assure when there is less hardware and less software to test. Minimising the amount of hardware and minimising the amount of software executing for any length of time is key to increasing battery life. This is only possible using low-level software where modules can be stripped to a minimum, and Python certainly doesn’t come under the heading of “low-power”.

The design approach for type (3) is arguably equally simple because the developers are likely to need the device to come quickly to market and to rapidly develop the next version and the version after that. This means Python and embedded Linux, which has regular security updates and can be developed and reasonably tested pretty quickly, make sense.

The design approach for type (2) is what caused the most debate. Arguably, it is better for the user if the battery life is longer since, for a long-term device, it will be cheaper to not have to replace the battery or entire device so often. However, since the device can easily be replaced, it will probably not be an expensive, critical device with high per-unit margins (or at least, replacing the battery will be cheap so in the long-term, the per-unit profit is massively affected by the initial and maintained development cost). So using entirely low-level favours battery life but increases development cost. Using high-level significantly reduces development cost but also battery life and (possibly) user satisfaction. Which side of the line you go for probably depends on the specific device. Popular opinion at the talk was that Python for a thermometer was reasonable and on reflection I am inclined to agree.

In conversation after the event, Adam B. from Simpleweb showed me an unusual (but very cool) little device called an iBeacon. For those unfamiliar with iBeacons, they are small, thumb-sized low-energy Bluetooth devices which you can track the location of. This allows tracking people as they move through a shop or similar areas. The devices shown to me were sealed units, extremely small (so no space for big memory chips and processors nor heat dissipation) and without a replaceable battery. But per-unit they are pretty expensive and typical use cases require longevity. Thus software for such a device is likely to need to be entirely low-level, for battery life, despite the fact that the devices themselves are easily replaceable. This is an example of a type (2) device better suited to C-based dev than Python-based dev.

This example brought us to think about the development process for the devices. Adam, Roger Shepherd, a few others and I discussed the following areas which we agreed make sense as ideas but need significant work to improve or integrate with standard IoT development practices:

  1. As an example, using a Creator CI40 for rapid development (in Python or otherwise), experimenting with features, nailing down an exact design and then creating the final product in proper low-level code (if necessary). In an ideal world, it would be an easy, automated process to go from Python to pure low-level so that any device can be developed for best battery life. As it stands at the moment, this isn’t possible, which brings me on to points 2 and 3.
  2. GCC and similar C compilers are a nightmare to set up, which makes development with them flaky and hard to do, all of which contributes to extended development time.
  3. Also, the C language and C libraries have poor modularisation in most code bases and there is a distinct lack of open-source, free systems for C-based package management. Thus, dedicated IoT software is either written from scratch every time or bought (at great expense) from another company (but often the bought libraries aren’t stripped down to just what is required). Thus low-level development for IoT (in C) is currently very expensive whereas in Python, there is great package management and modularisation.

Furthermore, as highlighted in my talk, C development is only going to get more expensive, as knowledge and understanding of low-level development by graduates entering the IoT industry decreases and the onus falls on companies to train new developers.

In conclusion then, while the example I gave during the talk was not the strongest technical example, there is still a strong case for teaching low-level development in schools and universities (though not to the exclusion of everything else). Furthermore, if, as an industry, we could perfect a technique for high-level prototype development with easy transition to production low-level code, that’d be great. It would cut development costs, improve products and reduce hardware costs. In the meantime, we will have to rely on team leaders to make an informed per-device choice of the tools and language used.

My internship: the experience and achievements

In this last post of the summer, which also marks the end of my summer internship, I would like to write a bit about my experience working for FlingOS™. Also, I would like to give you a summary of what has been achieved in the last nine weeks.

The experience

The internship has been a great experience. I have learnt a great deal about the x86 and MIPS32 processor architectures. Learning about these two very different designs alongside each other was truly valuable in recognising the merits of each approach and in learning about low-level development in general. Working on the IL to assembly conversion was fairly challenging but nevertheless fun. I also had a chance at writing and debugging substantial amounts of assembly code, which was definitely worth the effort. Using Visual Studio and C# has given me valuable experience in using the .NET framework as well. The internship provided me with skills I wouldn’t have learnt on my undergraduate course because most of the topics we covered during the summer are either not included or not discussed in such detail. So overall the experience with FlingOS™ has been a great addition to my studies. Although I was slightly intimidated by the amount of new information at the start, I think I managed to do well, with thanks to Ed for his patience and good teaching skills.

Achievements

During my internship, I helped to produce the resources for the upcoming tutorials, we achieved compiler support for MIPS and we created an extensive verification kernel. The new kernel turned out to be very useful as we uncovered some errors from earlier implementations that caused problems in the compiler. The testing kernel, on which we have been working for the last three weeks, has been a supreme success. It was great to see that the implementations worked as intended and also to detect a few corner-case bugs. To sum up, we have not only achieved what we set out to do at the beginning but were also able to produce the verification kernel, which will definitely help future developments and improve the stability of FlingOS™.

Future plans

This week, my summer internship has come to an end. However, this does not mean farewell. Our plan is that I will continue to work for the project during the coming academic year, as much as possible. I will also try to post here frequently to keep you up-to-date on what is happening here at FlingOS™, so you will definitely be seeing me around in the future.

Thank you for your interest in the project and thank you FlingOS™ for the great experience, it has been a great summer.

See you soon…

Roland

Cross-platform compiler verification kernel

The compiler verification framework I mentioned in last week’s post is currently under development and is due for completion by the end of next week. Of course, compiler testing never really reaches a completed state; new features can be added or old ones may require verification for a new use-case. So what I mean by completed is that by next week we will have included all the test cases we originally intended to include, which were decided upon after some thought and reasonably made assumptions. In the future, it is very likely that new test cases will be added to the framework. Our intended test cases will cover every expected (and many unexpected) use cases of the IL ops.

FlingOops™

The testing framework verifies the correctness of the compiler using behavioural testing, which essentially tests whether the IL to ASM conversions were correct. It is a behavioural testing framework because it tests the functionality of the compiler by testing the behaviour of the output and thus whether the compiler compiled correctly. It is also cross-architecture/cross-platform because it can be used for testing both the x86 and the MIPS architecture libraries (and any others we may add in the future).

The test kernel is called FlingOops™ and consists of a wrapper framework (with variations for different target architectures) and a long list of method calls, where each method contains a particular test relating to an operation as well as a message to the console stating whether the test case passed. Particular attention is paid to testing signed operations and the ones that require the handling of 64-bit values. To verify that signed and 64-bit values are calculated correctly, depending on the operation tested, we need to make sure that the carry and borrow bits are handled correctly and that overflow happens as expected. There aren’t special carry and borrow flags on MIPS, so these have to be implemented by the compiler and therefore it is crucial to test the correctness of these implementations. The testing is going well and it is great to see that so far, all the test cases passed with the exception of a couple of small errors which were in the x86 target library. Thankfully these issues were fixed quickly as we were able to draw from the knowledge and experience gained from working with MIPS. On the plus side, the detection of bugs in the x86 build demonstrated that our testing works!

For the last week

For the last week of my internship, our plan is to finalise the testing framework. I will post another article next week to confirm the completion of testing and to review my experience working for FlingOS™.

Thanks for reading! See you next week,
Roland

MIPS compiler support completed

For the last three weeks, we have been working towards completing the MIPS target architecture library. Finally this week, all the IL to ASM conversions have been added to the compiler. The implementations have been tested and we can confirm that we have compiler support for the MIPS32 processor architecture. So how did the testing go?

Testing

Some operations are fairly simple to test along the way, such as Add, Sub, Mul, Div, Shl etc., i.e. operations that use value types, since all we need to check is the result produced by a particular IL operation. The correctness of these operations can easily be verified without the need for a complete library. This is not the case for more complex operations, such as the ones related to objects, arrays and strings, i.e. reference types.

In order to check whether reference types function correctly, many different IL operations have to be implemented (NewObj, InitObj, Ldobj, Isinst etc), so we did not start testing these until the whole library was completed. The functionality of objects can be tested by creating a test class with some methods and fields within it. If these types initialise as expected then we can try to create a new instance of the class. If all these have been a success then we can test whether values passed to the class from outside return to the calling environment correctly. The testing of arrays and strings is a little more straightforward. In order to test arrays and strings, we can try to declare a new instance of these types, apply some basic array or string related manipulation to them and inspect the results.

During the testing process, some issues were uncovered. These issues were mainly related to the differences between the x86 and the MIPS architectures. On MIPS, only values contained in registers can be pushed onto the stack, whereas x86 lets the programmer push labels and immediate values directly. This is due to the CISC vs RISC architecture differences between x86 and MIPS. On MIPS, if we want to push a label onto the stack, we must first load the address of that label into a register using the La instruction and then push that value from the register onto the stack. The same is true for immediate values: first we must move the value into a register, then push that register onto the stack.

In many cases it is required to push the value zero onto the stack, but to do this we do not need to move zero to a general purpose register because MIPS has a $zero register which is fixed to the value zero. On MIPS there isn’t a Push or Pop assembly instruction as there is on x86, so we need to simulate these operations with store/load instructions combined with adjusting the stack pointer register using the Addi instruction.
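The simulated push and pop described above can be sketched with a tiny Python model. This is only an illustration: the `Cpu` class, its register names and the choice of $t0 as the temporary are assumptions made for the example, and the exact instruction order the FlingOS compiler emits is not given in this post.

```python
class Cpu:
    """Tiny model: a few registers plus a word-addressed stack."""

    def __init__(self, stack_top=64):
        self.regs = {"$zero": 0, "$sp": stack_top, "$t0": 0}
        self.mem = {}  # address -> 32-bit word

    def push(self, reg):
        # MIPS has no push instruction: adjust $sp, then store the register.
        self.regs["$sp"] -= 4                         # addi $sp, $sp, -4
        self.mem[self.regs["$sp"]] = self.regs[reg]   # sw   reg, 0($sp)

    def pop(self, reg):
        # MIPS has no pop instruction: load the register, then adjust $sp.
        self.regs[reg] = self.mem[self.regs["$sp"]]   # lw   reg, 0($sp)
        self.regs["$sp"] += 4                         # addi $sp, $sp, 4

    def push_immediate(self, value):
        # An immediate must go through a register first (assumed: $t0).
        self.regs["$t0"] = value & 0xFFFFFFFF         # li   $t0, value
        self.push("$t0")


cpu = Cpu()
cpu.push_immediate(42)   # immediate value travels via $t0
cpu.push("$zero")        # zero needs no temporary register
cpu.pop("$t0")           # $t0 now holds the zero pushed last
print(cpu.regs["$t0"], cpu.regs["$sp"])
```

Pushing a label would follow the same pattern as `push_immediate`, with the La instruction loading the address into the temporary register.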

There is another important thing to keep in mind when programming on MIPS. Data stored in memory must be aligned such that each object’s address is a multiple of its own size in bytes, e.g. a 4-byte word must be 4-byte aligned but a single byte needs no alignment (a.k.a. single byte alignment). So if we want to load from/store to memory using an offset, we need to either make sure the data is aligned or access the data in multiple parts to avoid the alignment issues.

We never need to worry about this when accessing the stack through $sp or $fp since the compiler guarantees these are always 4-byte aligned. For other memory accesses, the FlingOS Compiler does not require memory to be aligned (for compatibility with other architectures). Instead, it uses runtime checks to load memory from or store memory to unaligned addresses. This means C# code can be written more generically, with less for the programmer to worry about but does create a small performance hit at runtime.
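Accessing unaligned data in multiple parts can be sketched as follows. This is a minimal Python model, not the FlingOS compiler's actual runtime check: the function names `load_word_aligned` and `load_word_any`, the little-endian byte order and the bytearray standing in for memory are all assumptions made for illustration.

```python
def load_word_aligned(mem, addr):
    """Model of a word load: only legal on a 4-byte aligned address."""
    if addr % 4 != 0:
        raise ValueError(f"unaligned word load at address {addr}")
    return int.from_bytes(mem[addr:addr + 4], "little")


def load_word_any(mem, addr):
    """Load a 32-bit word from any address, falling back to byte loads."""
    if addr % 4 == 0:
        return load_word_aligned(mem, addr)   # fast path: one word load
    value = 0
    for i in range(4):                        # slow path: four byte loads
        value |= mem[addr + i] << (8 * i)     # assemble little-endian
    return value


memory = bytearray(range(16))                 # bytes 0x00 .. 0x0F
print(f"{load_word_any(memory, 4):08X}")      # aligned address
print(f"{load_word_any(memory, 5):08X}")      # unaligned address
```

The runtime check on the address is where the small performance hit mentioned above comes from: the aligned case costs one extra test, the unaligned case costs several byte accesses.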

What’s next?

What we would like to do during the remainder of this summer is to add a unit testing framework to the project that formalises test cases for current and possible future architectures. This way we can ensure that the current support for x86 and MIPS is stable and also, if the project is to be extended with a new target architecture library, then implementing that library can be done in an efficient way which is easily testable.

Thank you for reading this and if you are interested in the upcoming Launch Event then please subscribe on the FlingOS homepage.

See you soon…

Roland

Adding 64-bit values using 32-bit registers

Last week, I published a post discussing how to left shift 64-bit values using the available 32-bit registers on MIPS. This week, we thought I should write a similar article describing a technique for another interesting problem: how to add two 64-bit values. This may not seem difficult at first, however we must remember that MIPS processors do not have carry flags. We need the carry flag (or something equivalent) to preserve the result of an intermediate computational step. What we have to do is somehow simulate ‘Add with carry’ without the carry flag to allow for the addition of 64-bit integers using 32-bit registers.

Adding two 32-bit values is trivial since the operands and the result can each be contained in a single register, so here we can just use a simple ‘add’ instruction and no carry is needed. The sizes of the operands for the addition instruction are either 32 or 64 bits but in order to perform the addition, the sizes must match. An exception should be raised if the sizes are not the same. Before I discuss the method for adding two 64-bit values, let’s remind ourselves how signed and unsigned values are represented in 32-bit binary. The sign extension does not have an effect on the result of the computation due to the way two’s complement works; I only include it for completeness.

[Figure 1: signed and unsigned 32-bit integer representations, drawn as a cycle]

In the case of unsigned values, the possible integer values go from 0 to 2^32 - 1 and none of them are negative. However, if we want to represent signed integers in 32 bits then we need to use one bit, the most significant bit, to represent the sign, so we can represent both positive and negative integers in the range from -2^31 to 2^31 - 1. A negative value is derived by taking a positive integer and applying the method of two’s complement to it. Let’s see an example of how to convert a positive number to its negative equivalent in 8-bit binary representation.

Two’s complement:

0000 1010   # = 10
1111 0101   # invert bits
0000 0001   # add 1
1111 0110   # = -10

First you take the integer as a bit string then you invert all the bits and finally you add one to the result. Two’s complement can also be used to convert a negative integer to positive.
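The same invert-and-add-one procedure can be sketched in Python. The helper names `twos_complement` and `as_signed` are made up for this example; the 8-bit width matches the worked example above.

```python
def twos_complement(value, bits=8):
    """Negate `value` within `bits` bits: invert every bit, then add one."""
    mask = (1 << bits) - 1
    inverted = ~value & mask          # invert bits
    return (inverted + 1) & mask      # add 1, discarding any carry out


def as_signed(pattern, bits=8):
    """Interpret a bit pattern as a signed two's complement integer."""
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern


neg = twos_complement(0b00001010)     # start from 10
print(f"{neg:08b}", as_signed(neg))   # prints: 11110110 -10
```

Applying `twos_complement` to the result gives back the original pattern, which is the negative-to-positive conversion mentioned above.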

Adding 64-bit operands

The two 64-bit integer operands have to be stored in two registers each. The first operand is stored as $t1:$t0, while the second operand is loaded as $t3:$t2.

1.      Add low bytes

The first step of this algorithm is to add the two low bytes into $t4 which we will use as a temporary storage for this partial sum.

addu $t4, $t0, $t2

We use the unsigned operation because the low bytes do not have a sign bit as their upper bit! (On MIPS, addu also never raises an overflow exception, whereas add does.) We also need a way to know if there is a carry from this partial calculation. For example, if we add the following two binary numbers we get a result that cannot be represented in a single register (here I assume that the registers are 4 bits in size), i.e. there is a carry bit.

1111 + 0001 = 10000

What we can do here is to store the carry bit (the 5th bit if you like) in a temporary register in the next step.

2.      Simulate the carry flag

At this point we are going to use the ‘sltu’ instruction to store the carry bit in $t5.

sltu $t5, $t4, $t0

(Alternatively: sltu $t5, $t4, $t2 : either will do)

Here the ‘sltu’ instruction compares $t4 (sum of low bytes) with either of the individual sets of low bytes ($t0 or $t2). The computation performed by the ‘sltu’ instruction goes like this: if $t4 < $t0 (or, in the alternative form, if $t4 < $t2) then $t5 = 1, else $t5 = 0. It does not actually matter which set of low bytes we use for the comparison because $t4 will always be smaller than both of the individual low bytes if there is a carry. But let me show you with some examples:

  1. $t0  =  0000
    $t2  =  0000
    $t4  =  0000       $t4 = $t0 = $t2         -> $t5 = 0

Here the computation does not generate a carry. We can see that $t5 has been set to 0 as required since $t4 is not less than either $t0 or $t2.

  2. $t0  =  0000
    $t2  =  1111
    $t4  =  1111       $t4 > $t0, $t4 = $t2    -> $t5 = 0

There isn’t any carry in this example either because $t4 is not less than either $t0 or $t2.

  3. $t0  =  0011
    $t2  =  0110
    $t4  =  1001       $t4 > $t0, $t4 > $t2    -> $t5 = 0

No carry because $t4 is not less than $t0 or $t2.

  4. $t0  =     0001
    $t2  =     1111
    $t4  =(1)0000       $t4 < $t0, $t4 < $t2    -> $t5 = 1

Now, in this example we do generate a carry, so let’s see if the method works. Remember that $t4 is (for these examples) 4 bits in size, so the 5th bit (the ‘1’) is lost and $t4 = 0000. Hence $t4 < $t0 and $t4 < $t2, therefore $t5 is set to 1.

  5. $t0  =  1111
    $t2  =  1111
    $t4  =(1)1110       $t4 < $t0, $t4 < $t2    -> $t5 = 1

The last example works for the same reason as the previous one did.

Using my diagram above to visualise how numbers work, it is possible to see that no two numbers added together can ever wrap round past themselves, so the result will be greater than or equal to both operands if and only if there was no carry. The number system forms a cycle.
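The claim that either operand works for the comparison can be checked exhaustively for the 4-bit registers used in the examples. The Python below is just a quick sanity check under that assumption; `carry_via_sltu` is a made-up helper name that models addu followed by sltu.

```python
BITS = 4                      # register width used in the examples above
MASK = (1 << BITS) - 1


def carry_via_sltu(a, b):
    total = (a + b) & MASK        # addu: the carry out is discarded
    return 1 if total < a else 0  # sltu: unsigned compare with one operand


# Collect every operand pair where the simulated carry is wrong, or where
# comparing against the other operand would give a different answer.
mismatches = [
    (a, b)
    for a in range(MASK + 1)
    for b in range(MASK + 1)
    if carry_via_sltu(a, b) != ((a + b) >> BITS)
    or carry_via_sltu(a, b) != carry_via_sltu(b, a)
]
print(mismatches)             # prints: []
```

The argument does not depend on the register width: with a carry, the truncated sum equals a + b - 2^n, which is below a because b is below 2^n, and by the same reasoning below b.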

3.      Add high bytes with carry

Now, we need to add the two high bytes ($t1 and $t3) plus any carry that is left from the addition of the low bytes and store the result in $t1.

addu $t5, $t5, $t1      # $t5 = carry + $t1(high bytes of operand 1)
addu $t1, $t5, $t3      # $t1 = $t5 + $t3(high bytes of operand 2)

Finally we move the sum of the low bytes into $t0.
move $t0, $t4           # move partial sum to $t0

At the end of the process we have a result as a 64-bit value stored as $t1:$t0.
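As a cross-check, the whole sequence can be modelled in Python, masking every intermediate result to 32 bits so that each variable behaves like a register. The function names `add64` and `split` and the sample operands are assumptions made for this sketch; the comments show which instruction from the article each line stands for.

```python
MASK32 = 0xFFFFFFFF


def add64(t1, t0, t3, t2):
    """Add $t1:$t0 and $t3:$t2; return the sum as (high word, low word)."""
    t4 = (t0 + t2) & MASK32      # addu $t4, $t0, $t2
    t5 = 1 if t4 < t0 else 0     # sltu $t5, $t4, $t0
    t5 = (t5 + t1) & MASK32      # addu $t5, $t5, $t1
    t1 = (t5 + t3) & MASK32      # addu $t1, $t5, $t3
    t0 = t4                      # move $t0, $t4
    return t1, t0


def split(value):
    """Split a 64-bit value into (high word, low word)."""
    return (value >> 32) & MASK32, value & MASK32


a = 0x00000001FFFFFFFF           # low word is all ones, so adding 1 carries
b = 0x0000000000000001
high, low = add64(*split(a), *split(b))
print(f"{high:08X}:{low:08X}")   # prints: 00000002:00000000
```

Because of two's complement, the same routine also covers signed operands: adding the bit pattern for -1 to 1 yields zero.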

I hope you found this article helpful. See you soon…

Roland

Left shifting 64-bit values using 32-bit registers

Yesterday, while working on the implementation of the left shift IL operation (shl) for MIPS, we came across some challenges related to shifting 64-bit values using 32-bit registers. The solution is a bit tricky, so we thought it would be useful if I produced an article discussing how it can be done.

In order to shift a string of bits, we need two operands; the data itself and a value specifying the distance we want the data to be shifted. The shift value can either be a constant or a value stored in a register. Although in this article we assume that the values are in registers, the same technique can be adapted to be used with the assembly instructions that use constant values. There are four different cases we need to consider in terms of operand sizes:

  • 4-byte data, 4-byte shift value,
  • 4-byte data, 8-byte shift value (this is unsupported by C#),
  • 8-byte data, 4-byte shift value,
  • 8-byte data, 8-byte shift value.

The 4-8 case is unsupported by C#, so in this scenario the compiler throws an exception. The 4-4 case uses a single left shift assembly instruction (sllv) so there isn’t much explanation needed there.

8 – 4 case

In the 8-4 case, the 8-byte sized data is shifted by a value that is represented by a 4-byte binary number. Since we only have 32-bit (4-byte) registers, we need to store the data in two separate registers. One register contains the low bytes ($t0) while the other contains the high bytes ($t1). In this example, I store the shift value in $t2.

[Figures 1 and 2: the 8-byte data held in $t1:$t0 and the shift value held in $t2]

Consider a 1-byte left shift, i.e. $t2 contains the value 8. Now we have a problem because the top byte of $t0 must replace the bottom byte of $t1 and the rest of the data must be shifted correctly as well. So how do we shift an 8-byte value using 4-byte registers? I will show you how by going through some examples. There are two variations of the 8-4 case: one where $t2 < 32 and another where $t2 >= 32.

The method ($t2 < 32)

Let’s say we want to left shift this data by 1 byte (1 byte can be represented by two hexadecimal digits):

[Figure 3: the original data in $t1:$t0]

Somehow we want the result to end up looking like this:

[Figure 4: the desired result after a 1-byte left shift]

1.      Left shift high bytes by $t2

First, we want to left shift the high 4 bytes ($t1) of the original data by the value carried by $t2, which is 1-byte. The low 4 bytes remain unchanged for now. So we have:

[Figure 5: $t1 after the left shift, with $t0 unchanged]

2.      Right shift low bytes by (32 – $t2) into temporary

The next step is to logical right shift (not arithmetic right shift!) $t0 by (32 – $t2) into a register which we will use as temporary storage, say $t4. Since $t2’s value in bits is 8, we right shift $t0 by 24 into $t4. This way we get the portion of $t0 which we then want to copy into $t1. $t0 and $t1 remain unchanged at this point.

[Figure 6: $t4 holding the top byte of $t0]

3.      OR temporary with high bytes

Here we can combine $t1 and $t4 using the logical OR operation to get the correct result for the high 4 bytes which we store back to $t1.

[Figure 7: $t1 after the OR with $t4]

4.      Left shift low bytes by $t2

There is only one thing left to do: left shift the low 4 bytes ($t0) by $t2.

[Figure 8: the final result after shifting $t0]

Now the algorithm is complete. If we compare this result with the desired result above, we can see that they are identical.
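The four steps can be modelled in Python, again masking to 32 bits so that each variable behaves like a register. The function name `shl64_small` and the sample data are assumptions, since the original figures are not reproduced here. The zero-shift guard is also an addition that the article does not discuss: with $t2 = 0, step 2 would ask for a right shift by 32, and MIPS variable shifts only use the low 5 bits of the shift amount.

```python
MASK32 = 0xFFFFFFFF


def shl64_small(t1, t0, t2):
    """Left shift $t1:$t0 by $t2 bits, for 0 <= $t2 < 32."""
    if not 0 <= t2 < 32:
        raise ValueError("this variant only handles shifts below 32")
    if t2 == 0:
        return t1, t0             # guard: avoids the shift by 32 in step 2
    t1 = (t1 << t2) & MASK32      # 1. left shift high bytes by $t2
    t4 = t0 >> (32 - t2)          # 2. logical right shift low bytes into $t4
    t1 = t1 | t4                  # 3. OR temporary with high bytes
    t0 = (t0 << t2) & MASK32      # 4. left shift low bytes by $t2
    return t1, t0


high, low = shl64_small(0x11223344, 0x55667788, 8)
print(f"{high:08X} {low:08X}")    # prints: 22334455 66778800
```

Python's `>>` on a non-negative integer shifts in zeros, so it plays the role of the logical (not arithmetic) right shift required in step 2.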

The method ($t2 >= 32)

If $t2 >= 32 then we are left shifting the data by 32 or more bits which means that the least significant bit (little endian) of the data is pushed beyond the low bytes into the high bytes. In all cases where the shift value is greater than or equal to 32, $t0 ends up filled with zeros and the original contents of $t1 are lost completely. But in what form does $t0 take over $t1? Let me show you step-by-step as before. Now let’s assume that we are left shifting by 40 bits and we have the same original data as before.

[Figure 3: the original data in $t1:$t0]

But this time we want the result to be this:

[Figure 9: the desired result after a 40-bit left shift]

1.      Move low bytes into high bytes

We copy the contents of the low bytes into the high bytes. We do this to save the contents of $t0 into $t1. The original data becomes:

[Figure 10: the contents of $t0 copied into $t1]

2.      Fill low bytes with zeros

Since the data is pushed all the way beyond the low bytes, we can fill $t0 with zeros.

[Figure 11: $t0 filled with zeros]

3.      Left shift high bytes by ($t2 – 32)

The final step is to left shift $t1 by ($t2 – 32) which is 8 in our case. So the desired result is achieved as expected.

[Figure 12: the final result after shifting $t1 by 8]
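The three steps for the $t2 >= 32 variation can be modelled in Python in the same way. The function name `shl64_large` and the sample data are assumptions for illustration; the upper bound of 64 on the shift value is also an assumption of this sketch rather than something stated above.

```python
MASK32 = 0xFFFFFFFF


def shl64_large(t1, t0, t2):
    """Left shift $t1:$t0 by $t2 bits, for 32 <= $t2 < 64."""
    if not 32 <= t2 < 64:
        raise ValueError("this variant only handles shifts from 32 to 63")
    t1 = t0                            # 1. move low bytes into high bytes
    t0 = 0                             # 2. fill low bytes with zeros
    t1 = (t1 << (t2 - 32)) & MASK32    # 3. left shift high bytes by ($t2 - 32)
    return t1, t0


high, low = shl64_large(0x11223344, 0x55667788, 40)
print(f"{high:08X} {low:08X}")         # prints: 66778800 00000000
```

Note that the incoming value of `t1` is never read, which matches the observation that the original high bytes are lost completely.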

8 – 8 case

In the 8-8 case, both the data and the shift values are 8 bytes in size. There is one important observation we must make; shifting a 64-bit value by 64 bits or more is pointless since the result will always be zero. We also know that the number 64 is represented by this binary number: 0b0100 0000 which can easily be contained in the low bytes of the shift value ($t2). Actually any non-zero value beyond the 6th bit would yield a zero result but let’s just consider 4 bytes to be the smallest size we can manage.

[Figures 1 and 13: the 8-byte shift value held in $t3:$t2]

To conclude, if $t3 is non-zero then the result of the left shift will definitely be zero, while if $t3 is zero then we can proceed the same way as in the 8-4 case by simply ignoring $t3.
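Putting the cases together, a complete 8-8 left shift might look like the Python sketch below. The function name `shl64` is made up for this example. Returning zero when $t2 itself is 64 or more (with $t3 zero), and returning the data unchanged for a zero shift, are assumptions added here; the article only covers the $t3 check and the two 8-4 variations.

```python
MASK32 = 0xFFFFFFFF


def shl64(t1, t0, t3, t2):
    """Left shift the data $t1:$t0 by the 8-byte shift value $t3:$t2."""
    if t3 != 0 or t2 >= 64:
        return 0, 0                          # shift of 64 or more: result is zero
    if t2 >= 32:                             # 8-4 case with $t2 >= 32
        return (t0 << (t2 - 32)) & MASK32, 0
    if t2 == 0:
        return t1, t0                        # nothing to shift
    high = ((t1 << t2) & MASK32) | (t0 >> (32 - t2))   # 8-4 case with $t2 < 32
    low = (t0 << t2) & MASK32
    return high, low


print(shl64(0x11223344, 0x55667788, 0, 8) == (0x22334455, 0x66778800))
print(shl64(0x11223344, 0x55667788, 1, 8) == (0, 0))
```

Checking $t3 first means the high word of the shift value never needs to take part in any arithmetic.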

I hope you found this article helpful. Please leave a comment if you think I missed something or if you have anything to add and I will do my best to respond.

See you around.

Roland

World first: C# kernel running on MIPS

Exciting news today as we announce our working kernel for MIPS based on the Creator CI20 board.

To the best of our knowledge, this is the first time in the world that anyone has got a C# kernel or operating system (of any form) working on a MIPS processor.

How did we do it?

We started the week by setting up the environment in which we were able to send a kernel binary file to the CI20. To see how this setup process is done, please read Ed’s post published earlier this week. After the connections were established we started to implement the necessary IL operations for the MIPS architecture to get the test kernel working.

The conversion of IL implementations from x86 to MIPS has been relatively smooth, except for the fact that in the MIPS architecture data stored in the memory must be half- or full-word aligned. This caused some initial headaches, however, Ed was determined to get to the bottom of the issue and in little time the problem was solved.

Other differences that required a lot of study and thought were related to the assembly code syntax for MIPS (GNU Assembler, GAS) and the instructions available (MIPS has a reduced instruction set). Although MIPS is a RISC architecture, meaning that more instructions are often needed to perform the equivalent computation compared to x86, there are 32 general-purpose registers available (as opposed to 8 on 32-bit x86), which makes implementation much easier. Furthermore, the programmer can also make use of pseudo-instructions, which speed up implementation.

What does it do?

So what does the test kernel actually do? The answer is both not that much and quite a lot.

Although the functionality of the kernel is limited to changing the colour of the on-board LED and reading/writing characters from/to the UART ports, the fact that the kernel does that much proves that the FlingOS compiler is in a stable and solid state. It is a great proof of concept and initial step from which the new MIPS kernel can progress.

This has been a very exciting week and we are looking forward to completing the target architecture library and expanding the test kernel. In a few weeks’ time we will have a fully functioning IL compiler for MIPS.

Can I try it?

We’ll be releasing a compiler package and a stable copy of the test kernel in the next month. We’d like to expand our compiler and proof-of-concept test kernel a bit before releasing them into the wild.

Keep an eye on this space! Please ask questions below.

Ed and Roland

Stage 2: Boot a custom OS

Earlier this week I tweeted that “Stage1: Boot a different OS – complete”, meaning that I had successfully booted an alternative OS on the Creator CI20s (which were kindly provided by our sponsor, Imagination Technologies®). Well, today I succeeded in booting a very basic custom operating system on the CI20s, so here’s how I did it.

For starters, I downloaded and installed the current compiler toolchain for Windows, Sourcery CodeBench for MIPS, available here. I installed mine to a non-standard directory but it works just fine. We’ll come on to how to use it later.

I also downloaded and installed Putty and Serva, both of which are necessary tools. Putty provides a console interface to the serial connection required to talk to the CI20’s U-Boot bootloader. Serva provides an easy way to set up a TFTP server on Windows. Both of these tools are free. Again, we’ll come on to how to use them later.

Lastly, I had to buy one small (cheap) bit of extra hardware: a USB to UART converter. Be aware that there are two chips widely used to produce these devices. One of them only supports Linux and Windows 7 and earlier. The other chip supports Linux and all current versions of Windows, so make sure you get the right one! I bought one for Windows 8.1 (i.e. the second type of chip) from Amazon (with one day delivery on a Sunday no less!). I ordered from 3C4u who use Amazon. (The first time I tried to order two of these the package never arrived. I re-ordered and they were delivered fine. Amazon gave me a refund and Prime subscription extension for the first order so I’m not complaining too much!)

For the custom OS I wanted to try out something which I knew worked. So I went online and found Lardcave.net’s great series of tutorials on writing a custom CI20 OS. I cloned the Git repo and started following the instructions. For the USB to TTL chip I have, the tutorial is correct: you need to connect RXD on the converter to TXD on the CI20, and vice versa for TXD on the converter and RXD on the CI20. To avoid having to use the power cable, you can also attach the 5V pin to any of the CI20’s 5V_IN pins on the primary expansion header. The board will power on as soon as you connect the 5V pin, so hold off until later for that! You’ll also need to connect an Ethernet cable from the CI20 to the same hub or switch your PC/laptop/WiFi hub is connected to, so that the CI20 can connect to the TFTP server.

After cloning the Git repo I ran into a few issues. The Lardcave.net tutorials were written for Mac/Linux and assume you compile GCC yourself. Since I installed Sourcery CodeBench, the Makefile was not set up to compile properly. I eventually worked out how to change it so that it works properly. Here is a copy of my Makefile:

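# Sourcery CodeBench cross-toolchain (mips-linux-gnu- prefix)
AS=mips-linux-gnu-as -mips32
CC=mips-linux-gnu-gcc
LD=mips-linux-gnu-ld
OBJCOPY=mips-linux-gnu-objcopy
CFLAGS=-Os

OBJS=start.o main.o

# Convert the linked ELF into a raw binary image
hello.bin: hello.elf
	$(OBJCOPY) -O binary $< $@

# Link little-endian (-EL) using the linker script
hello.elf: $(OBJS)
	$(LD) -EL -T linker.lds -o $@ $+

# Build assembly (.S) and C (.c) sources with the same rule
%.o: %.[Sc]
	$(CC) $(CFLAGS) -EL -c -o $@ $<

# Recipe lines must start with a tab character, not spaces
.PHONY: clean
clean:
	rm -f *.o *.elf *.bin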

I combined this with a simple batch script called build.bat in the same directory as the Makefile, which allows me to specify the path to Sourcery CodeBench’s bin folder, instead of the other versions of GCC which I have installed. The batch script looked like this:

@echo off
SET BD=C:\Users\Ed\Documents\Coding\C\2015\MIPS\Compiler\bin
%BD%\cs-make
pause
@echo on

Where BD is set to the path to Sourcery CodeBench’s “bin” folder.

Having compiled the “hello.bin” file I proceeded to set up Putty and the TFTP server. Here is a series of images I took showing the process. By the time I was done entering the commands into Putty (also shown below), the CI20 showed the nice, purple LED as expected.

[Image: Putty configuration]

For the Putty configuration shown above, remember to update the COM port name to the name of the COM port device on your computer. This can be found by opening Device Manager and looking under the Ports node of the tree.

[Image: Serva config]

For the Serva configuration shown above, remember to update the “TFTP Server root directory” to the folder that contains your “hello.bin” file.

[Image: Putty console]

For the commands to U-Boot, remember to replace the serverip “192.168.0.6” and the ipaddr “192.168.0.20” with values for your network. The “serverip” should be the IP address of the computer which is running Serva (which can be looked up by running “ipconfig /all” in a Windows command prompt on the server computer). The “ipaddr” can be any unused address on the same subnet as the server; on a typical home network (netmask 255.255.255.0) this means the first three parts must match the IP address of your server.
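The rule about matching parts of the address is really a subnet check. Here is a minimal C sketch of it, assuming a netmask of 255.255.255.0 (common on home networks, but check yours with “ipconfig /all”). The helper names `ipv4` and `same_subnet` are hypothetical and only serve this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack four dotted-decimal parts into one 32-bit value. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | (uint32_t)d;
}

/* Two hosts can talk directly if their addresses agree on every netmask bit. */
bool same_subnet(uint32_t first, uint32_t second, uint32_t netmask)
{
    return (first & netmask) == (second & netmask);
}
```

With the addresses used here, `same_subnet(ipv4(192, 168, 0, 6), ipv4(192, 168, 0, 20), ipv4(255, 255, 255, 0))` is true, whereas an ipaddr of 192.168.1.20 would fail the check.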

[Image: Serva log]

If the boot completes successfully, you should see a Serva log similar to the one above. The LED on the CI20 should turn purple as shown below.

[Image: Final result]

It’s all about the compiler

Welcome back!

This week our focus was on the FlingOS compiler. We started with an introduction to the compilation process and the compiler itself. We also looked at the CI20 board and booted it up to see it run a full-scale Linux operating system. This was very exciting!

Introduction to the compiler

I spent some time learning how the intermediate language generated by MSBuild is translated into architecture-specific assembly, especially how calls to methods are handled during this translation. For this purpose I observed the output of simple methods, such as adding two integers or displaying colour and text on screen. I learned that understanding this process in the context of stack operations is particularly important: how arguments to methods and local variables are handled, how return values are passed back to the caller, and how space is allocated for those return values.
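To show what “in the context of stack operations” means, here is a tiny C sketch that evaluates the IL of a method adding two integers on an evaluation stack. It is an illustration only and not how the FlingOS compiler works (the compiler translates IL to assembly rather than interpreting it). The enum, `ADD_METHOD` and `run_il` are hypothetical names; the four opcodes correspond to the real IL instructions ldarg.0, ldarg.1, add and ret.

```c
#include <stddef.h>
#include <stdint.h>

/* Four IL instructions, modelled as a hypothetical enum for this sketch. */
typedef enum { OP_LDARG_0, OP_LDARG_1, OP_ADD, OP_RET } il_op;

/* Roughly the IL for: static int Add(int a, int b) { return a + b; } */
const il_op ADD_METHOD[] = { OP_LDARG_0, OP_LDARG_1, OP_ADD, OP_RET };

#define STACK_SLOTS 8

/* Runs the IL on an evaluation stack; returns 0 on malformed input. */
int32_t run_il(const il_op *code, size_t length, int32_t arg0, int32_t arg1)
{
    int32_t stack[STACK_SLOTS];
    size_t depth = 0;

    for (size_t i = 0; i < length; i++) {
        switch (code[i]) {
        case OP_LDARG_0:
        case OP_LDARG_1:
            /* Push an argument onto the stack. */
            if (depth == STACK_SLOTS) return 0;
            stack[depth++] = (code[i] == OP_LDARG_0) ? arg0 : arg1;
            break;
        case OP_ADD:
            /* Pop two values, push their (wrapping) sum. */
            if (depth < 2) return 0;
            depth--;
            stack[depth - 1] = (int32_t)((uint32_t)stack[depth - 1] + (uint32_t)stack[depth]);
            break;
        case OP_RET:
            /* The return value is whatever sits on top of the stack. */
            return depth > 0 ? stack[depth - 1] : 0;
        }
    }
    return 0;
}
```

Calling `run_il(ADD_METHOD, 4, 2, 3)` pushes 2 and 3, replaces them with 5, and returns it. A compiler back end has to reproduce the same pushes and pops with real stack and register instructions.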

After this introductory session, we jumped right into development tasks. These involved moving some architecture-specific parts of the compiler into the x86 target architecture library. I learned that an architecture-dependent compiler is undesirable because you may want to compile to different architectures using the same compiler. A generic compiler will make life easier when we port FlingOS onto MIPS.

Making a generic compiler

The task of shifting assembly generation from the compiler to the x86 architecture library proved to be more difficult than I expected. I realised that I did not have the necessary insight into how the compiler functions at deeper levels. However, with some help from Ed, I was able to complete the tasks. As it turned out, I wasn’t actually that far off from the correct solution. Nevertheless, I felt that I was thrown in at the deep end, and since I almost drowned, Ed decided that I should spend some more time studying the compiler in more detail. He kindly produced a document explaining the structure of the compiler and how the different classes and methods are involved in the compilation process. I have been studying this document along with the code itself to gain a better understanding. By the end of the week, the compiler had been changed to a fully generic version and Ed had added the MIPS target architecture library to the project, although the actual implementations are yet to be completed in the coming weeks.
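The idea of a generic compiler with pluggable target libraries can be sketched in C as a table of function pointers. This is a hypothetical illustration and not the actual structure of the FlingOS compiler (which is written in C#). All names here are made up for the example, and the emitted instructions are simplified stand-ins for what a real back end would produce for an IL add.

```c
/* What the generic compiler needs from any target (hypothetical interface). */
typedef struct {
    const char *name;
    const char *(*emit_add)(void);
} target_arch;

/* x86 target library: add two registers. */
static const char *x86_emit_add(void)
{
    return "add eax, ebx";
}

/* MIPS target library: the same operation with MIPS registers. */
static const char *mips_emit_add(void)
{
    return "addu $t0, $t0, $t1";
}

const target_arch X86_TARGET = { "x86", x86_emit_add };
const target_arch MIPS_TARGET = { "MIPS", mips_emit_add };

/* The generic part: it never mentions a specific architecture. */
const char *compile_add(const target_arch *target)
{
    return target->emit_add();
}
```

With this shape, supporting a new architecture means supplying another `target_arch` table, while `compile_add` and the rest of the generic code stay unchanged.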

Plans for Week 4

Next week, I will continue to solidify my knowledge of the compiler, and soon we will begin porting FlingOS onto the MIPS architecture.

See you all next week.

Roland