How to be negative and stay positive

Bjørn

Sat Aug 01 2009, 10:44PM

Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058

Steve mentioned that PIC microcontrollers have a stupid instruction set. I want to show that if you don't like the instructions you don't have to use them at all.

For example, on the ARM processor you can do every arithmetic and logical operation using nothing but the subtraction instruction. To access memory you need load and store instructions too, but you only need to learn 3 instructions well to write any program. All other instructions are just there to make things simpler or are just an artifact of the hardware.

Addition can be done by subtracting the negative: a = a + b can be done as a = a - (-b)
Example: 34 + 7 = 34 - (-7) = 41

Left shift is the same as multiplication by two, which is the same as adding a variable to itself. NOT can be done as: -1 - a. Some things are more involved, like right shifts: since the carry propagates from right to left, there is no fast way to do a right shift. It can be done by rotating left by one bit n - 1 times, where n is the number of bits in a register, and then clearing the most significant bit. (There are faster ways, but they are still slow.) Often it is much faster and simpler to modify the algorithm to do the same job with left shifts instead.
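
A minimal sketch of the tricks above, modelled in Python rather than ARM assembly so it can be run anywhere. It assumes a 32-bit register. The helper names (sub, neg, add, shl1, bit_not, rol1, shr1) are made up for illustration, and the two sign tests stand in for a conditionally executed sub that looks at the N flag.

```python
MASK = 0xFFFFFFFF  # models a 32-bit register; MASK is also -1 in two's complement


def sub(a, b):
    """The only arithmetic primitive: 32-bit wrapping subtraction."""
    return (a - b) & MASK


def neg(a):
    return sub(0, a)             # -a = 0 - a


def add(a, b):
    return sub(a, neg(b))        # a + b = a - (-b)


def shl1(a):
    return sub(a, neg(a))        # a << 1 = a + a = a - (-a)


def bit_not(a):
    return sub(MASK, a)          # NOT a = -1 - a


def rol1(a):
    """Rotate left by one bit: double, then put the lost top bit back at the bottom."""
    doubled = shl1(a)
    if a > 0x7FFFFFFF:           # top bit was set (the N flag in the assembly version)
        doubled = sub(doubled, MASK)   # doubled + 1 = doubled - (-1)
    return doubled


def shr1(a):
    """Logical right shift by one: rotate left n - 1 = 31 times, then clear the top bit."""
    for _ in range(31):
        a = rol1(a)
    if a > 0x7FFFFFFF:
        a = sub(a, 0x80000000)   # remove the most significant bit
    return a


print(add(34, 7), shl1(5), hex(bit_not(0x0F0F0F0F)), shr1(6))
```

The 31 rotations make it clear why it usually pays to restructure the algorithm around left shifts instead.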

This example shows a proper function with a loop and a return from a call, all done with subtractions. It does an often-used operation, bitwise AND. The inputs are in r2 and r3; the output ends up in r2. (r0, r1, r3, r5 and r10 are overwritten along the way.)

AND:	
	sub	r0,r0,r0	; r0 will always contain 0 from now on
	sub	r1,r0,#1	; r1 will always contain -1 from now on

	sub	r10,r0,#32	; r10 = -32
	sub	r10,r0,r10	; r10 = 0 - (-32) = 32, the loop counter

loop:	
	subs	r2,r2,#0	; check most significant bit of r2
	submis	r3,r3,#0	; if 1 then check most significant bit of r3
	
	sub	r5,r0,r2	; r2 <<= 1 (r2 = r2 - (-r2))
	sub	r2,r2,r5

	submi	r2,r2,r1	; if both = 1 then r2 = r2 + 1
	
	sub	r5,r0,r3	; r3 <<= 1
	sub	r3,r3,r5
	
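	subs	r10,r10,#1	; count down, Z is set when r10 reaches 0
here:	
	subne	pc,pc,#((here - loop) + 8)	; not done: jump back to loop (pc reads as here + 8)
	sub	pc,r14,#0	; done: return to the caller through the link register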
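
To check the logic without an ARM at hand, the routine can be modelled step by step in Python. This is only a sketch under assumptions: registers are plain 32-bit integers, the N flag is modelled by the hypothetical helper is_negative, and the function name and_by_subtraction is made up for illustration.

```python
MASK = 0xFFFFFFFF  # models a 32-bit register


def sub(a, b):
    """32-bit wrapping subtraction, the only operation used."""
    return (a - b) & MASK


def is_negative(x):
    """Models the N flag: true when the most significant bit is set."""
    return x > 0x7FFFFFFF


def and_by_subtraction(r2, r3):
    r0 = sub(r2, r2)                      # r0 = 0 (the assembly uses sub r0,r0,r0)
    r1 = sub(r0, 1)                       # r1 = -1
    r10 = sub(r0, sub(r0, 32))            # r10 = 32, the loop counter
    while True:
        n = is_negative(sub(r2, 0))       # subs   r2,r2,#0
        if n:
            n = is_negative(sub(r3, 0))   # submis r3,r3,#0
        r2 = sub(r2, sub(r0, r2))         # r2 <<= 1
        if n:
            r2 = sub(r2, r1)              # submi  r2,r2,r1  (r2 = r2 + 1)
        r3 = sub(r3, sub(r0, r3))         # r3 <<= 1
        r10 = sub(r10, 1)                 # subs   r10,r10,#1
        if r10 == 0:                      # subne falls through, so return
            return r2


print(hex(and_by_subtraction(0xF0F0F0F0, 0xFF00FF00)))
```

Each pass moves the top bit of both inputs out and shifts their AND into the bottom of r2, so after 32 passes r2 holds the complete result.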

So what is the practical use of this? Not much, except to show that at the bottom all assembly programming is really the same: it is about stringing together very simple operations to build complex functions. The principles are exactly the same if you have 3 possible instructions or 300, so a good programmer should be able to do well on all architectures.

Even though I can do well on a PIC, I can write ARM code several times faster. That seems to be mainly because it is harder to make a mental model of the PIC, with its banking and 8-bitness (sometimes 7 bits), rather than because of a bad instruction set.

Steve Conner

Sun Aug 02 2009, 09:54AM

Registered Member #30 Joined: Fri Feb 03 2006, 10:52AM
Location: Glasgow, Scotland
Posts: 6706

I thought the ARM only had those 3 instructions anyway.

Nowadays no-one is expected to program in assembler, so instruction sets are designed with a view to being compiler-friendly. You just make a version of the GNU C compiler for your new processor and then shove it out the door. Bjorn's 3-instruction set (well, actually it's more like 5, as he used test and branch) might be quite reasonable, if it could execute them really fast.

Unfortunately the PIC instruction set is neither user-friendly nor compiler-friendly; assembly and C users seem to curse it equally. The new 24- and 30-series PICs might be somewhat better.

I agree that the problem with (16-series) PICs is mostly due to the banked memory. It's an 8-bit processor, so it struggles when asked to address more than 256 bytes of memory. Except it's not really 8 bit, it's Harvard architecture with a 14-bit program memory bus. The instructions are all 14 bits, but some of the branch instructions only have room for an 11-bit address. So the program memory ends up divided into 2K-word pages, and it's easy to jump into the wrong one. Computed gotos are even worse as they only have room for an 8-bit address.
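
A rough sketch of how those short addresses get extended, written in Python for illustration. It assumes the usual mid-range (16-series) arrangement as I understand it: a 13-bit program counter, with GOTO/CALL taking the top two bits from PCLATH<4:3> and a write to PCL taking the top five bits from PCLATH<4:0>. The function names are made up.

```python
def goto_target(pclath, addr11):
    """GOTO/CALL: 11 bits come from the opcode, bits 12:11 come from PCLATH<4:3>."""
    return ((pclath & 0x18) << 8) | (addr11 & 0x7FF)


def computed_goto_target(pclath, pcl_result):
    """Write to PCL: 8 bits come from the ALU result, bits 12:8 come from PCLATH<4:0>."""
    return ((pclath & 0x1F) << 8) | (pcl_result & 0xFF)


# The same GOTO lands in a different 2K-word page if PCLATH is stale.
print(hex(goto_target(0x00, 0x123)), hex(goto_target(0x08, 0x123)))

# A jump table starting at 0x05F0: index 0x20 wraps within the 256-word block
# unless PCLATH is adjusted by hand.
print(hex(computed_goto_target(0x05, (0xF0 + 0x20) & 0xFF)))
```

In both cases the upper bits come from a register the programmer has to keep up to date, which is where the wrong-page jumps come from.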

Bjørn

Sun Aug 02 2009, 11:48AM

Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058

So what really is an instruction? It is not an easy question to answer. You can argue that it is the specific bit pattern that is loaded at regular intervals to control the functions of the CPU.

In that case the ARM has 2^32 possible instructions, since it has a 32-bit instruction word. Some bit patterns will cause an abort or an illegal instruction trap, but most of them will do a valid operation.

Some will do the same thing; are they different instructions or the same?
- mov r0,r0
- sub r0,r0,#0
- add r0,r0,#0
They all have exactly the same effect on the state of the CPU, but with different bit patterns.

So different "instructions" can be interchangeable, and on some CPUs identical "instructions" can do different things depending on some state of the CPU set by another instruction, so that definition is quite strained.

On a hardware level an instruction might not even be designed, or use any transistors of its own; it might just be a useful artifact. On the ARM, sub, mvn and bic are all the result of the same 32 eor gates, inserted into the datapath as a controllable inverter so that subtraction can be done with the adder.
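
A small Python sketch of that idea, assuming a 32-bit datapath. The helper invert plays the role of the 32 eor gates on the second operand; the function names are made up for illustration, and this is a model of the principle, not of the real ARM datapath.

```python
MASK = 0xFFFFFFFF  # 32-bit datapath


def invert(b, enable):
    """32 EOR gates: pass b through unchanged, or flip every bit when enabled."""
    return b ^ (MASK if enable else 0)


def add(a, b):
    return (a + invert(b, False)) & MASK       # adder, inverter off


def sub(a, b):
    return (a + invert(b, True) + 1) & MASK    # adder, inverter on, carry in = 1


def mvn(b):
    return invert(b, True)                     # move, inverter on


def bic(a, b):
    return a & invert(b, True)                 # and, inverter on


print(sub(41, 7), hex(mvn(0)), hex(bic(0xFF, 0x0F)))
```

Once the inverter is there for subtraction, switching it on in front of the move and AND paths gives mvn and bic for free.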

To make it more interesting, program and data are not really separate, so it is not the case that the program always controls the processing; the data can act as virtual instructions and control the execution just as well. So viewed on a higher level, program and data are interchangeable.

The definition that works best for a programmer is that what feels like the same instruction is the same instruction, because it fits the mental model of the CPU. This is why it often helps to understand something about the internal workings of a CPU. If you try to design a very simple CPU you will most likely become a better PIC programmer, because then you suddenly understand why the instruction set is how it is. Pure C programmers are missing out on all this, so if they get into real trouble they sometimes have problems getting out, because they can't model as efficiently what is going on at the lowest level.

The ARM instruction set is optimised for humans, not for compilers. The reason is that it was found that even if compiler writers ask for certain instructions, the compilers never use them. In the rare cases they are used, it generally makes the code longer and slower.

50 years ago there was a one-to-one relationship between the high-level code and the instructions emitted by the compiler, so there was a significant advantage in having instructions that mirrored the high-level language completely. It turned out that this was a very inefficient way to do it, and today no usable compiler works that way. There are exceptions, like some Forth compilers, but they are not designed to generate efficient code anyway.

The conclusion is that my code uses more than one instruction by one definition, but by the most relaxed and useful definition it still uses only one instruction. Each ARM instruction is a complex instruction that does one arithmetic/logical operation, one test with conditional execution, one optional update of the condition codes, one move of the result to a selectable register, and one of 5 different shift operations.

That means that one instruction in the most extreme cases can do the work of a page of PIC instructions and still be simpler to model in your mind.

Steve Conner

Sun Aug 02 2009, 12:15PM

Registered Member #30 Joined: Fri Feb 03 2006, 10:52AM
Location: Glasgow, Scotland
Posts: 6706

Well, I take the view that an instruction is its mnemonic in assembler. So for instance, NOP is one instruction, even if the processor has 10 different opcodes that all do nothing. And I'd class all the different subtract instructions that you used as different instructions, because they have different mnemonics.

But I can think of exceptions, for instance I used to use an Analog Devices DSP with a "divide" instruction that you had to execute 16 times to get the answer. The assembler didn't do this for you, you had to write "divs" once and then "divq" 15 times in your program. The manual called divs and divq "instruction primitives".

I still argue that modern architectures are optimised for compilers. An architecture meant for programming in assembly has a hardware stack; a compiler-friendly one has no hardware stack, but has support for fast handling of a software one.

If you have a one-to-one relationship between high-level statements and machine instructions, then it's not a compiler you're using, it's an assembler. But some people call C a structured assembler nowadays, and when you make heavy use of assembly macros, that strains the definition too. And on the kind of bad old processors that I think Bjorn has in mind, the machine code was a (really nasty ad-hoc) high-level language with an interpreter on the chip in microcode.

For all that I dis PICs, they were the first chips that I was actually able to program in assembler. I'd previously tried 68000, but couldn't understand it. And learning to do this made me a better C programmer. I still probably couldn't manage the 68k or x86, but I wouldn't even try, I'd probably use ARM instead.

Bjørn

Sun Aug 02 2009, 02:14PM

Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058

The idea that the mnemonic is the instruction does not really work with ARM. Because you have 16 different conditions and the set-condition-codes option, there are 32 variations of every instruction. In addition to that you have the shifts that are tagged on at the end, which make a much larger difference to the behaviour of the instruction than whether it sets the condition codes or not. So you think of it as a single instruction with variations, just like you think of addlw #1 and addlw #2 as the same instruction with variations on the PIC.

I was quite good at 68000 but it took some time to figure it out. When I started doing realtime graphics it became clear that only close to optimal code was fast enough, since every instruction took from 4 to 130 cycles to complete on a 7 MHz CPU. Then I noticed that I only used a few fast instructions, because most of the instructions were too slow and did things, with addressing modes, that only compiler makers could dream up. So the trick was to use a handful of general-purpose instructions that executed in 4 cycles. That always resulted in the fastest and often the most readable code. x86 is fairly similar: most of it is just there to make old code run, and if you try to use it the performance goes down the drain and everything gets complicated.

When I moved over to ARM I found that only the fast, simple instructions existed, and that in addition they were much more flexible, so they could do twice the amount of work in one quarter of the time.

The ARM does not have any hardware support for a stack, except that it saves R13 in a shadow register so you can use that as a stack pointer when handling interrupts and such. That makes it very friendly, since there are no special stack instructions or stack operations by default; what you see is what you get. Obviously ARM Ltd had to make a compressed version of the instruction set and mess it all up by adding push and pop instructions just to save some code space.

Steve Conner

Sun Aug 02 2009, 09:51PM

Registered Member #30 Joined: Fri Feb 03 2006, 10:52AM
Location: Glasgow, Scotland
Posts: 6706

Hmm, I had to go and look up the ARM instruction set, but I see what you mean now. So you can tack any condition code onto the end of any instruction and make it conditional?

I thought the "IT" instruction was a nice idea, I always hated doing If-Then-Else on the PIC.

The 130 cycle thing on the 68k is what I mean by the bad old days of CISC processors. The slow instructions were slow because they were broken down by the processor's microcode into a series of simpler ones. The 4 cycle instructions weren't, they just executed directly.

RISC then took the bold step of leaving all the slow instructions out, so what was left could execute directly on hardware with one instruction per cycle and no need for microcode any more. Programming in C you don't even notice the difference, except the processor is cheaper.

Bjørn

Sun Aug 02 2009, 10:38PM

Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058

Steve Conner wrote: "Hmm, I had to go and look up the ARM instruction set, but I see what you mean now. So you can tack any condition code onto the end of any instruction and make it conditional?"

Yes, it helps a lot when you don't have to branch around one or two instructions all the time when you do some complicated logic. It is also faster on a simple CPU since it does not cause the pipeline to become invalid and have to be refilled from the destination address.

It also makes it very comfortable to check for several conditions, for example for replacing linefeed and return with space:
cmp r0,#10	; is it a linefeed?
cmpne r0,#13	; if not, is it a return?
moveq r0,#32	; if either matched, replace it with a space
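
The way the Z flag is carried from one conditional instruction to the next can be modelled in a few lines of Python. This is only a sketch: the function name replace_line_breaks is made up, and the variable z stands in for the Z flag.

```python
def replace_line_breaks(r0):
    """Model of the three-instruction sequence, with z standing in for the Z flag."""
    z = (r0 - 10) == 0        # cmp   r0,#10   sets Z if r0 == 10
    if not z:                 # cmpne only executes when Z is clear
        z = (r0 - 13) == 0    # cmpne r0,#13   sets Z if r0 == 13
    if z:                     # moveq only executes when Z is set
        r0 = 32               # moveq r0,#32
    return r0


text = "one\r\ntwo"
print("".join(chr(replace_line_breaks(ord(c))) for c in text))
```

If the first compare already matched, the second one is skipped and Z stays set, which is what makes the chain behave like an OR of the two tests.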

Steve Conner wrote: "I thought the "IT" instruction was a nice idea, I always hated doing If-Then-Else on the PIC."

Yes, the Thumb instruction set is not bad at all. I just question how wise it is to add a whole new instruction set just to save a few bytes of flash.
