Discussion:
itoa'd you so?
Steve
2007-09-19 20:42:25 UTC
Permalink
As many of you may be aware I have been actively job hunting these
past couple of weeks.
I've gone to many interviews now where I've been asked to do some
pretty peculiar things, such as solving logic puzzles, write random
function x etc.
But this one takes the cake, mostly because the interviewer told me he
had "seen better" but could not or would not elaborate further.

Here is the setup.

There is no standard way to convert an int into a char*, or at least
there is nothing in the standard library.

So I was asked how I would write an itoa function.
The goal was to target an embedded platform, and so clock cycles and
overhead really count.

The solution could be in either C or C++

I came up with 2 solutions which both amounted to roughly the same code.

std::string itoa(int in){
std::stringstream out;
out << in;
return(out.str());
}

and

char* itoa(int in){
std::stringstream out;
out << in;
return(out.rdbuf.c_str());
}

So my question is, what is wrong with this method (I haven't tested it
so there may be a minor syntax error, but that aside)?

Seems to me it's the most efficient way to do it in a typesafe manner,
but maybe I'm just missing something.
Can anyone see anything wrong or maybe just a better way to do an itoa?

Sincerely,
Steve

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-19 20:25:15 UTC
Permalink
Post by Steve
I came up with 2 solutions which both amounted to roughly the same code.
std::string itoa(int in){
std::stringstream out;
out << in;
return(out.str());
}
and
char* itoa(int in){
std::stringstream out;
out << in;
return(out.rdbuf.c_str());
}
So my question is, what is wrong with this method (I haven't tested it
so there may be a minor syntax error, but that aside)?
What's wrong with that method is that it's using the standard library,
which was explicitly not available. C++ kind of hides the fact that
you're essentially calling itoa() when you do out << in there, once
you go through the overloaded function that implements <<, etc.

So, try doing it again in plain C without using any standard library
functions.

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Weston Cerny
2007-09-19 21:18:22 UTC
Permalink
Post by Levi Pearson
Post by Steve
I came up with 2 solutions which both amounted to roughly the same code.
std::string itoa(int in){
std::stringstream out;
out << in;
return(out.str());
}
and
char* itoa(int in){
std::stringstream out;
out << in;
return(out.rdbuf.c_str());
}
So my question is, what is wrong with this method (I haven't tested it
so there may be a minor syntax error, but that aside)?
What's wrong with that method is that it's using the standard library,
which was explicitly not available. C++ kind of hides the fact that
you're essentially calling itoa() when you do out << in there, once
you go through the overloaded function that implements <<, etc.
So, try doing it again in plain C without using any standard library
functions.
--Levi
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
What about and I am not a C or C++ person so i'll avoid the char*
discussion and just use string to denote the operations?
string itoa(int in) {
string ret = "";
int current = in;

while (current > 0 && current %= 10)
{
ret = (char)((int)current + (int)'0') + ret;
current /= 10;
}

return ret;
}


/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Weston Cerny
2007-09-19 21:24:31 UTC
Permalink
Post by Weston Cerny
Post by Levi Pearson
Post by Steve
I came up with 2 solutions which both amounted to roughly the same code.
std::string itoa(int in){
std::stringstream out;
out << in;
return(out.str());
}
and
char* itoa(int in){
std::stringstream out;
out << in;
return(out.rdbuf.c_str());
}
So my question is, what is wrong with this method (I haven't tested it
so there may be a minor syntax error, but that aside)?
What's wrong with that method is that it's using the standard library,
which was explicitly not available. C++ kind of hides the fact that
you're essentially calling itoa() when you do out << in there, once
you go through the overloaded function that implements <<, etc.
So, try doing it again in plain C without using any standard library
functions.
--Levi
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
What about and I am not a C or C++ person so i'll avoid the char*
discussion and just use string to denote the operations?
string itoa(int in) {
string ret = "";
int current = in;
while (current > 0 && current %= 10)
{
ret = (char)((int)current + (int)'0') + ret;
current /= 10;
}
return ret;
}
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Oops i'm double using my variable and not checking my bounds correctly
for a negative number.

string itoa(int in) {
string ret = "";
int current = in;
int digit = 0;

while (current != 0 && digit = current % 10)
{
ret = (char)(digit+ (int)'0') + ret;
current /= 10;
}

return ret;
}


/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-19 21:11:07 UTC
Permalink
Post by Weston Cerny
Oops i'm double using my variable and not checking my bounds correctly
for a negative number.
string itoa(int in) {
string ret = "";
int current = in;
int digit = 0;
while (current != 0 && digit = current % 10)
{
ret = (char)(digit+ (int)'0') + ret;
current /= 10;
}
return ret;
}
I'm sure this is much closer to what they were looking for. On the
other hand, strings are not a native datatype, so you're still using
standard libraries. On many embedded systems, you may not have the
luxury of a standard string class, so you'd be stuck with arrays of
characters. This makes it a little more tricky, since you now have to
deal with the size of the array, who allocates it, etc.

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Michael L Torrie
2007-09-19 21:25:10 UTC
Permalink
Post by Steve
So I was asked how I would write an itoa function.
The goal was to target an embedded platform, and so clock cycles and
overhead really count.
The solution could be in either C or C++
In the context of a company that does embedded programming, he really
means "do it in C."
Post by Steve
I came up with 2 solutions which both amounted to roughly the same code.
std::string itoa(int in){
std::stringstream out;
out << in;
return(out.str());
^^^^^^^^^^^^^^^^^^^
nope. you can't do that.
Post by Steve
}
and
char* itoa(int in){
std::stringstream out;
out << in;
return(out.rdbuf.c_str());
^^^^^^^^^^^^^^
can't do that either, at least safely.
Post by Steve
}
So my question is, what is wrong with this method (I haven't tested it
so there may be a minor syntax error, but that aside)?
Seems to me it's the most efficient way to do it in a typesafe manner,
but maybe I'm just missing something.
Can anyone see anything wrong or maybe just a better way to do an itoa?
Sincerely,
Steve
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Michael L Torrie
2007-09-19 21:29:10 UTC
Permalink
Post by Michael L Torrie
Post by Steve
std::string itoa(int in){
std::stringstream out;
out << in;
return(out.str());
^^^^^^^^^^^^^^^^^^^
nope. you can't do that.
Nevermind. This one would probably work, if out.str() returns a new
object and not a reference to some internal object within the out object.
Post by Michael L Torrie
Post by Steve
}
and
char* itoa(int in){
std::stringstream out;
out << in;
return(out.rdbuf.c_str());
^^^^^^^^^^^^^^
can't do that either, at least safely.
This one, though, is definitely bad. By the time the caller gets the
char *, it's already pointing to invalidated stack space. Good way to
get a segfault or worse a buffer overrun that could lead to bad things.

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Steve
2007-09-19 21:30:27 UTC
Permalink
Ok now this has got me stumped what was unsafe about #2?
Taking advantage of the type safety features in the standard lib, is
what I was trying to show, and I have a lot of code running using
both. If thats unsafe, I have some patchin' to do.
Post by Michael L Torrie
Post by Steve
So I was asked how I would write an itoa function.
The goal was to target an embedded platform, and so clock cycles and
overhead really count.
The solution could be in either C or C++
In the context of a company that does embedded programming, he really
means "do it in C."
Post by Steve
I came up with 2 solutions which both amounted to roughly the same code.
std::string itoa(int in){
std::stringstream out;
out << in;
return(out.str());
^^^^^^^^^^^^^^^^^^^
nope. you can't do that.
Post by Steve
}
and
char* itoa(int in){
std::stringstream out;
out << in;
return(out.rdbuf.c_str());
^^^^^^^^^^^^^^
can't do that either, at least safely.
Post by Steve
}
So my question is, what is wrong with this method (I haven't tested it
so there may be a minor syntax error, but that aside)?
Seems to me it's the most efficient way to do it in a typesafe manner,
but maybe I'm just missing something.
Can anyone see anything wrong or maybe just a better way to do an itoa?
Sincerely,
Steve
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Steve
2007-09-19 21:33:40 UTC
Permalink
Nevermind, as I asked it you replied.
Post by Steve
Ok now this has got me stumped what was unsafe about #2?
Taking advantage of the type safety features in the standard lib, is
what I was trying to show, and I have a lot of code running using
both. If thats unsafe, I have some patchin' to do.
Post by Michael L Torrie
Post by Steve
So I was asked how I would write an itoa function.
The goal was to target an embedded platform, and so clock cycles and
overhead really count.
The solution could be in either C or C++
In the context of a company that does embedded programming, he really
means "do it in C."
Post by Steve
I came up with 2 solutions which both amounted to roughly the same code.
std::string itoa(int in){
std::stringstream out;
out << in;
return(out.str());
^^^^^^^^^^^^^^^^^^^
nope. you can't do that.
Post by Steve
}
and
char* itoa(int in){
std::stringstream out;
out << in;
return(out.rdbuf.c_str());
^^^^^^^^^^^^^^
can't do that either, at least safely.
Post by Steve
}
So my question is, what is wrong with this method (I haven't tested it
so there may be a minor syntax error, but that aside)?
Seems to me it's the most efficient way to do it in a typesafe manner,
but maybe I'm just missing something.
Can anyone see anything wrong or maybe just a better way to do an itoa?
Sincerely,
Steve
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Michael L Torrie
2007-09-19 21:37:07 UTC
Permalink
Post by Steve
Ok now this has got me stumped what was unsafe about #2?
Taking advantage of the type safety features in the standard lib, is
what I was trying to show, and I have a lot of code running using
both. If thats unsafe, I have some patchin' to do.
Well, if I understand your code correctly, you're returning a char *
pointer (likely const char *) to the internal buffer of the out
stringstream, correct? If so, then on return the caller receives a
pointer to the internal buffer of the out object which lived in the
stack frame of the itoa call, which no longer exists.
Post by Steve
Post by Steve
char* itoa(int in){
std::stringstream out;
out << in;
return(out.rdbuf.c_str());
}
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Steve
2007-09-19 21:45:15 UTC
Permalink
Yes Michael you have it correct.
I made the mistake of assuming this would be a return value (copy)
operation, but I can see now it's passing a pointer, and that pointer
is pointing to something which is now out of scope.

Thankfully I looked at some code I actually do have and it uses #1
which is safe because it's returning the value.
Post by Michael L Torrie
Post by Steve
Ok now this has got me stumped what was unsafe about #2?
Taking advantage of the type safety features in the standard lib, is
what I was trying to show, and I have a lot of code running using
both. If thats unsafe, I have some patchin' to do.
Well, if I understand your code correctly, you're returning a char *
pointer (likely const char *) to the internal buffer of the out
stringstream, correct? If so, then on return the caller receives a
pointer to the internal buffer of the out object which lived in the
stack frame of the itoa call, which no longer exists.
Post by Steve
Post by Steve
char* itoa(int in){
std::stringstream out;
out << in;
return(out.rdbuf.c_str());
}
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Bryan Sant
2007-09-19 22:23:38 UTC
Permalink
Post by Michael L Torrie
Well, if I understand your code correctly, you're returning a char *
pointer (likely const char *) to the internal buffer of the out
stringstream, correct? If so, then on return the caller receives a
pointer to the internal buffer of the out object which lived in the
stack frame of the itoa call, which no longer exists.
You're exactly right. That's why itoa in the C standard lib has you
pass your own char[] into them. They're expecting the caller to
allocate and babysit the buffer.

Oh the joys of life with manual memory management :-).

-Bryan

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-19 22:27:15 UTC
Permalink
Post by Bryan Sant
You're exactly right. That's why itoa in the C standard lib has you
pass your own char[] into them. They're expecting the caller to
allocate and babysit the buffer.
Oh the joys of life with manual memory management :-).
As Michael said, itoa is not in the real C standard library, though
some implementations do package a version of itoa in with the standard
C library functions. Thus the premise behind this whole exercise. :)

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Michael L Torrie
2007-09-19 23:10:09 UTC
Permalink
Post by Bryan Sant
Oh the joys of life with manual memory management :-)
Manual memory management is often, in my opinion, a huge plus of C and
C++. With reference-counting smart pointers and destructors, memory
management in C++ is very straight-forward, fast, and safe. Just
understanding a bit about how C++ does scoping, and you can very easily
build and destroy entire data structures all without a single leak and
without having to rely on a garbage collector. Not saying a GC is bad;
simply that it's not always necessary.

With only a few caveats, C++ programmers can easily move to programming
in Java and dealing with the GC dropping references right and left. But
it rarely works the other way. 9/10 core dumps in the BYU CS 240 class
are because of Java. It pollutes C++ programmers like nothing else!
One of the caveats for C++ programmers is the way that C++ guarantees
destruction when an object goes out of scope, especially when you rely
on the side effects. Guaranteed destruction doesn't translate directly
into Java, although the "finalize" keyword can help (if I recall
correctly). A good article on the idea is at
http://royontechnology.blogspot.com/2007/01/deterministic-destruction-of-objects-in.html

It's always fun to catch java-isms in C or C++ code, the most prominent
one is the "type var = new type" expression, which is often used where a
static declaration would be far better (in C or C++). Of course using
smart pointers (reference-counting objects that hold a pointer), the
paradigm for use is much more java-like.
Post by Bryan Sant
-Bryan
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Alex Esplin
2007-09-19 23:25:46 UTC
Permalink
Post by Michael L Torrie
Post by Bryan Sant
Oh the joys of life with manual memory management :-)
Manual memory management is often, in my opinion, a huge plus of C and
C++. With reference-counting smart pointers and destructors, memory
management in C++ is very straight-forward, fast, and safe. Just
understanding a bit about how C++ does scoping, and you can very easily
build and destroy entire data structures all without a single leak and
without having to rely on a garbage collector. Not saying a GC is bad;
simply that it's not always necessary.
This is one of the reasons why I'm starting to like Objective-C so
much. I started learning it because I figured it would be the way to
go for some stuff I want to write for Macs here at work, but the more
I learn about it, the more I see places where the little extra it does
beyond C is a very good extra.
--
Alex Esplin

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-19 23:00:59 UTC
Permalink
Post by Michael L Torrie
Manual memory management is often, in my opinion, a huge plus of C and
C++. With reference-counting smart pointers and destructors, memory
management in C++ is very straight-forward, fast, and safe. Just
understanding a bit about how C++ does scoping, and you can very easily
build and destroy entire data structures all without a single leak and
without having to rely on a garbage collector. Not saying a GC is bad;
simply that it's not always necessary.
Reference-counting smart pointers and destructors aren't really manual
memory management, though. Reference-counting is a simple automatic
garbage collection algorithm. It's got advantages, especially that
its operation is deterministic and fairly simple, but also
disadvantages such as the need to maintain reference counts, which can
actually add signficant overhead. Of course, in C++, you'd simply not
use them where that overhead would matter. You can also link in a
conservative garbage collector, if you'd like. This flexibility is a
double-edged sword, in my opinion. C++ is hugely complex and has a
myriad of ways to do things, so it's really a tool that requires deep
knowledge of the language, available libraries, and best practices to
use well.
Post by Michael L Torrie
With only a few caveats, C++ programmers can easily move to programming
in Java and dealing with the GC dropping references right and left. But
it rarely works the other way. 9/10 core dumps in the BYU CS 240 class
are because of Java. It pollutes C++ programmers like nothing else!
One of the caveats for C++ programmers is the way that C++ guarantees
destruction when an object goes out of scope, especially when you rely
on the side effects. Guaranteed destruction doesn't translate directly
into Java, although the "finalize" keyword can help (if I recall
correctly). A good article on the idea is at
http://royontechnology.blogspot.com/2007/01/deterministic-destruction-of-objects-in.html
Everyone (who wants to be a Real Programmer, at least) needs to learn
how memory allocation works sooner or later. It's not Java's fault
that it doesn't teach memory allocation. People who learn C++ as a
first language make a lot of segfaults, too. :) If you haven't learned
how memory allocation works yet, you're not yet a C++ programmer, so
you're not 'polluted', you're just a neophyte.
Post by Michael L Torrie
It's always fun to catch java-isms in C or C++ code, the most prominent
one is the "type var = new type" expression, which is often used where a
static declaration would be far better (in C or C++). Of course using
smart pointers (reference-counting objects that hold a pointer), the
paradigm for use is much more java-like.
Then you have to watch out for subtle interactions with STL containers
and such, if I remember correctly. One of the disadvantages of having
so many different ways of doing things is that not all the ways are
compatible.

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-20 04:26:27 UTC
Permalink
C++ is hugely complex and has a myriad of ways to do things, so it's really a tool that requires deep knowledge of the language, available libraries, and best practices to use well.
I disagree with 2 of your 3 statements and only partially agree with the
other. Let me respectfully address them below, in the interest of an
edifying discussion. Please set me straight where I err.

First, you contend that you must have a deep knowledge of the C++
language to use it well. Let me address that point last, because I
partially agree with you on that one.

Second, you contend that you need deep knowledge of available libraries
to use C++ well. Is that not also true of all languages? Those who are
ignorant of the useful libraries in their language of choice are doomed
to reimplement them. This surely applies to all languages. Are you
perhaps inferring that C++'s standard library is so lacking that you
have to go to outside sources to make it usable? If so, I agree, but
Google can help. However, I still contend that in order to make good use
of any language, you have got to be familiar with the available
libraries. This is especially true of Java, which alone isn't much of a
tool, but when coupled with the myriad of outside libraries, it's pretty
capable.

Third, you contend that you must know C++ best practices to use it well.
Are you saying that there are languages that *don't* require developers
to employ best practices to use the them well? I think this is common to
all languages. Perhaps you are inferring that because there are so many
ways to do an individual task that it is too hard to guess the best one
without a deeper knowledge?

Now back to your first argument, that you must have a deep knowledge of
the language to use it well. I don't think this is entirely true. You
certainly have to know some things, on which topic I have two pieces of
advice:

1. You should find some well-written C++ to pattern after (which might
prove difficult because so few people seem to follow my next point).

2. You need to read a few entries in the C++ Super FAQ and a few pages
from the beginning of Josuttis's book, The C++ Standard Library (which
pages incidentally have nothing to do with the C++ Standard Library but
cover basic C++ language features).

If you do those two things, you'll be in good shape to code lots of
awesome C++.

You certainly don't have to understand some of the language's cobweb
filled corners like virtual inheritance and the unfavorable diamond or
recursive template meta-programming to be a good C++ developer.

Here are a couple other examples of why C++ is a great programming language.

1. Qt's foreach macro. The language designers did not write a sane
foreach. In fact, the one that ships with the standard library is a
steaming pile of unusable poo (unlike most poo which is not steaming and
at least somewhat usable), but Qt managed to use cool language features
to make a nifty, useful, and type-safe foreach macro that *feels* like
part of the language despite being added many years after the most
recent language revision.

2. Macro-based debugging. Using the macro system, you can remove and
insert entire chunks of code just by flipping a flag in your makefile.
In other languages (like Java and Python), you'd have to do some serious
magic to actually *remove* debug code from your software at release
time, rather than just wrap them in false conditionals, to get the same
kind of performance as C++.

3. A linker! Why don't other languages have linkers these days? It's
such a pain to kludge together jar files that you hope have all the
classes you need to ship in one bundle. I know the Mono team is working
feverishly to finally write a linker after how many years? Using a C++
linker, I can create a single binary that will run on just about every
Linux distro (even spanning kernel versions).

4. Templates. Templates are cool when used properly, and type safe. Of
course, templates can be abused, so use with caution. This is one of
those areas that requires a little bit of "deep knowledge." :)

5. Speed. For most user-interactive desktop applications, you still
can't JIT fast enough to match C++'s raw speed.

Anyway, I'll get off the C++ pedestal. Sometimes I feel like I'm still
rooting for the losing team, but for certain applications C++ is just
great, even if your last name isn't Stroustrup.

--Dave

P.S. I had to add "poo" to Thunderbird's spell checker. How poopy is
that? (I had to add poopy too)

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Bryan Sant
2007-09-20 05:59:43 UTC
Permalink
Post by Dave Smith
If you do those two things, you'll be in good shape to code lots of
awesome C++.
Thanks for the tips.
Post by Dave Smith
2. Macro-based debugging. Using the macro system, you can remove and
insert entire chunks of code just by flipping a flag in your makefile.
In other languages (like Java and Python), you'd have to do some serious
magic to actually *remove* debug code from your software at release
time, rather than just wrap them in false conditionals, to get the same
kind of performance as C++.
A blessing and a curse. Macro-based debugging has merit, but nasty
black magic macros used for non-debugging purposes can be a bear to
debug. I'd also argue that the single jump instruction that is
produced with an "if (debug) {" test one would use in a language that
doesn't support macros doesn't even register on the performance
overhead meter. This is a total non-issue as far as performance is
concerned. However, removing the code all together does reduce the
size of your binary image which is kind of nice.
Post by Dave Smith
3. A linker! Why don't other languages have linkers these days? It's
such a pain to kludge together jar files that you hope have all the
classes you need to ship in one bundle. I know the Mono team is working
feverishly to finally write a linker after how many years? Using a C++
linker, I can create a single binary that will run on just about every
Linux distro (even spanning kernel versions).
Boo. Down with static linkers ;-). Everything in java/mono is loaded
dynamically. A java or mono linker doesn't make any sense. If you
want to get the same effect, you'd just bundle everything up in a
single jar/assembly.
Post by Dave Smith
5. Speed. For most user-interactive desktop applications, you still
can't JIT fast enough to match C++'s raw speed.
Not that this isn't a legitimate bragging right of C++, but modern
desktop computers are so bloody fast, that this is less of an issue.
And just to be contrary, I'd point out that post-JIT code can in fact
match the speed of C++. The problem is that you have to incur some
latency while the JIT takes place.
Post by Dave Smith
Anyway, I'll get off the C++ pedestal. Sometimes I feel like I'm still
rooting for the losing team, but for certain applications C++ is just
great, even if your last name isn't Stroustrup.
I don't know if I would say that C++ is a losing team. Many may
prefer something else, but the facts are this:

Tons of jobs -- 8,860 on dice.com.

TIOBE programming language popularity list has C++ in spot #5. That's
above heavy hitters like perl, python, C#, ruby, and javascript.
http://www.tiobe.com/tpci.htm

-Bryan

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-20 07:04:32 UTC
Permalink
Post by Bryan Sant
A blessing and a curse. Macro-based debugging has merit, but nasty
black magic macros used for non-debugging purposes can be a bear to
debug. I'd also argue that the single jump instruction that is
produced with an "if (debug) {" test one would use in a language that
doesn't support macros doesn't even register on the performance
overhead meter. This is a total non-issue as far as performance is
concerned. However, removing the code all together does reduce the
size of your binary image which is kind of nice.
If debug is defined as a constant, the code ought to be completely
eliminated by any reasonable compiler. Why keep dead code around?
C-style preprocessing is pretty much the worst possible way to do most
of the things it does, but at least it does them in a fairly simple
and consistent manner.

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-20 21:57:35 UTC
Permalink
Post by Levi Pearson
If debug is defined as a constant, the code ought to be completely
eliminated by any reasonable compiler. Why keep dead code around?
C-style preprocessing is pretty much the worst possible way to do most
of the things it does, but at least it does them in a fairly simple
and consistent manner.
I'm surprised no one has mentioned Aspect Oriented Programming for
debugging, which to me is a textbook no-brainer application. It operates
(potentially) directly on the parse tree and should be ideal for
removing/adding debugging code at build time (or even run time). Is it dead?

--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Nicholas Leippe
2007-09-20 22:29:02 UTC
Permalink
Post by Dave Smith
I'm surprised no one has mentioned Aspect Oriented Programming for
debugging, which to me is a textbook no-brainer application. It operates
(potentially) directly on the parse tree and should be ideal for
removing/adding debugging code at build time (or even run time). Is it dead?
AOP is not dead. There are many projects to add AOP to several languages.
They include at least (from my recollection when I last read up on it) Lisp,
PHP, boo, Ruby, Python, javascript, and I believe even C++ --although code
weaving in a statically compiled language is a bit more difficult. I never
looked into Java or .net/mono, but I imagine it should be possible even if
not easy.

I have been itching for a good project to try my hand at AOP with...


/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Jonathan Ellis
2007-09-20 23:39:24 UTC
Permalink
Post by Nicholas Leippe
AOP is not dead. There are many projects to add AOP to several languages.
They include at least (from my recollection when I last read up on it) Lisp,
PHP, boo, Ruby, Python, javascript, and I believe even C++ --although code
weaving in a statically compiled language is a bit more difficult. I never
looked into Java or .net/mono, but I imagine it should be possible even if
not easy.
As with many "design patterns" from lower-level languages, dynamic
languages have been doing AOP for years without bothering to call it
that. You don't need an AOP library in Python, Ruby, or Javascript.

-Jonathan

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-20 23:51:32 UTC
Permalink
Post by Jonathan Ellis
As with many "design patterns" from lower-level languages, dynamic
languages have been doing AOP for years without bothering to call it
that. You don't need an AOP library in Python, Ruby, or Javascript.
The whole point of "design patterns" is that people have been doing
them for years without the pattern being formally described. That
aside, I don't think AoP qualifies as a design pattern. It's more of
a design tool--a way to decompose your problem into parts. It's both
a way of thinking about programming and implementing those programs.
Some languages may already have the necessary features to implement
AoP, while others (such as Java) may require language extensions.

Anyway, here's a paper that talks about the usefulness of AoP in
higher-order languages: http://www.cs.brown.edu/~sk/Publications/Papers/Published/tk-ptcts-adv-ho-lang/paper.pdf

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Bryan Sant
2007-09-20 22:43:44 UTC
Permalink
Post by Dave Smith
I'm surprised no one has mentioned Aspect Oriented Programming for
debugging, which to me is a textbook no-brainer application. It operates
(potentially) directly on the parse tree and should be ideal for
removing/adding debugging code at build time (or even run time). Is it dead?
--Dave
AOP is alive and well, and quickly becoming main-stream in the java
world. Popular IoC containers such as Spring provide runtime or
compile-time weaving. It's common to see AOP used for
logging/debugging in medium to large java apps.

-Bryan

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-20 23:28:53 UTC
Permalink
Post by Dave Smith
I'm surprised no one has mentioned Aspect Oriented Programming for
debugging, which to me is a textbook no-brainer application. It
operates (potentially) directly on the parse tree and should be ideal
for removing/adding debugging code at build time (or even run
time). Is it dead?
Not dead, but since it often requires language features that aren't
provided by most mainstream languages, it also requires new compilers
and stuff. That bumps up the cost of adoption a bit.

It's interesting to note that Gregor Kiczales, inventor of AoP and
AspectJ, had earlier done a lot of work on the CLOS Meta-Object
Protocol. Unsurprisingly, you can implement AoP in Common Lisp with a
Meta-Object Protocol implementation and some macros. :)

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-20 06:56:32 UTC
Permalink
Post by Dave Smith
C++ is hugely complex and has a myriad of ways to do things, so it's really a tool that requires deep knowledge of the language, available libraries, and best practices to use well.
I disagree with 2 of your 3 statements and only partially agree with
the other. Let me respectfully address them below, in the interest of
an edifying discussion. Please set me straight where I err.
I think we agree more than you seem to think we do.
Post by Dave Smith
First, you contend that you must have a deep knowledge of the C++
language to use it well. Let me address that point last, because I
partially agree with you on that one.
I was probably not very clear with my intent here. Instead of 'use
well' I should have said something like 'get advantages over a safer
language like Java'. And there are definitely advantages to be gotten
from using C++, but they require deep and broad knowledge and come at
the cost of some safety.
Post by Dave Smith
Second, you contend that you need deep knowledge of available
libraries to use C++ well. Is that not also true of all languages?
Those who are ignorant of the useful libraries in their language of
choice are doomed to reimplement them. This surely applies to all
languages. Are you perhaps inferring that C++'s standard library is so
lacking that you have to go to outside sources to make it usable? If
so, I agree, but Google can help. However, I still contend that in
order to make good use of any language, you have got to be familiar
with the available libraries. This is especially true of Java, which
alone isn't much of a tool, but when coupled with the myriad of
outside libraries, it's pretty capable.
C++ has lots of libraries available and they're not all of the same
quality or even implementation style. There's crazy template-heavy
stuff in Boost, there's object-oriented stuff in some places, etc.
Qt, which you mention later, kind of does its own thing.

Java, on the other hand, is less flexible in style and therefore
everything follows the same general object-oriented style. Java also
comes with a lot more functionality built-in and standard. This has
advantages and disadvantages, but I think in general it's got a
learning curve that's not nearly so steep.
Post by Dave Smith
Third, you contend that you must know C++ best practices to use it
well. Are you saying that there are languages that *don't* require
developers to employ best practices to use the them well? I think this
is common to all languages. Perhaps you are inferring that because
there are so many ways to do an individual task that it is too hard to
guess the best one without a deeper knowledge?
I'm saying that if you don't know when to use auto pointers, RAII,
RTTI, all the rules about how to handle virtual functions, etc. you're
going to be at a serious disadvantage. I'm not an expert at either
C++ or Java, but Java seems to have a lot fewer of these subtleties to
learn before you can write decent code. For another comparison, look at
Stroustrup's book on C++ vs. the K&R C book. Also the C FAQ vs. the C++
FAQ, which is a rather fat book and not even available for free online
last time I checked.

So, yes, you ought to learn best practices for any language you use.
In some languages, however, they're more about codifying common sense
than mapping out a path through a minefield. :)
Post by Dave Smith
Now back to your first argument, that you must have a deep knowledge
of the language to use it well. I don't think this is entirely
true. You certainly have to know some things, on which topic I have
<advice elided>

That's great, but that will probably get you to the point where you
can write code that's on par with what the average Java Joe writes.
This is not to say that Java is bad, just that it's got a different
set of tradeoffs that necessarily limits the control you have over
your system's performance.

Basically, I compare C++ to the contractor-grade tool and Java to the
pro-sumer grade tool. Both do roughly the same thing, but the
contractor-grade tool has got the safety shields removed and has a
tendency to kick back at you if you have poor form. You can get
similar results with either as long as you've got moderate
proficiency, but the contractor-grade tool lets you take things to
another level if you've got the chops to handle it, plus you can swap
out parts when necessary. The pro-sumer grade is a one-size-fits-all
that's pretty adjustable, but only within the parameters set by the
factory.

Of course, if you give the novice the contractor tool, he's likely to
take a finger off, or worse. :)
Post by Dave Smith
You certainly don't have to understand some of the language's cobweb
filled corners like virtual inheritance and the unfavorable diamond or
recursive template meta-programming to be a good C++ developer.
Well, maybe you don't have to, but you definitely need to have someone
who does handy to review your code then. Or at least know what areas
of the language to avoid.
Post by Dave Smith
Here are a couple other examples of why C++ is a great programming language.
1. Qt's foreach macro. The language designers did not write a sane
foreach. In fact, the one that ships with the standard library is a
steaming pile of unusable poo (unlike most poo which is not steaming
and at least somewhat usable), but Qt managed to use cool language
features to make a nifty, useful, and type-safe foreach macro that
*feels* like part of the language despite being added many years after
the most recent language revision.
Language extension is one of the cool features of C++, but it's also a potential source of pitfalls for novices. The Qt macro you speak of is probably done reasonably well, but textual substitution macros are an evil thing and, if written or used improperly, are liable to cause enormous amounts of pain and suffering. Also, overloading operators can hide expensive or unorthodox operations from people who aren't expecting them. Again, a trade-off.
Post by Dave Smith
2. Macro-based debugging. Using the macro system, you can remove and
insert entire chunks of code just by flipping a flag in your
makefile. In other languages (like Java and Python), you'd have to do
some serious magic to actually *remove* debug code from your software
at release time, rather than just wrap them in false conditionals, to
get the same kind of performance as C++.
Any compiler worth its bits ought to be able to eliminate unreachable code, so I don't think this is necessarily as big a win as you say, though macros are nice. If only there was a sane macro system available that operated on the parse tree rather than the text...
Post by Dave Smith
3. A linker! Why don't other languages have linkers these days? It's
such a pain to kludge together jar files that you hope have all the
classes you need to ship in one bundle. I know the Mono team is
working feverishly to finally write a linker after how many years?
Using a C++ linker, I can create a single binary that will run on just
about every Linux distro (even spanning kernel versions).
Yeah, this is nice, but not a big deal. You can link Java code with
gcj, if you want.
Post by Dave Smith
4. Templates. Templates are cool when used properly, and type safe. Of
course, templates can be abused, so use with caution. This is one of
those areas that requires a little bit of "deep knowledge." :)
I love the idea of generic programming. The implementation of it in
C++ is lacking, though. See an ML derivative for sanity here. C++ is
the way it is for efficiency reasons, though. Efficiency reasons and
compatibility with C are why its abstractions are all full of holes,
it has fewer safety guarantees than other modern languages, and
especially why it continues to have so many adherents. :) I'm glad
that a language like C++ exists, but I will continue to believe that
it's a tool best reserved for those who really need the extra control
it gives you over a language like Java.
Post by Dave Smith
5. Speed. For most user-interactive desktop applications, you still
can't JIT fast enough to match C++'s raw speed.
Most user-interactive desktop applications aren't really dealing with
'raw speed', whatever that means. I'm not really sure where the
slowness of modern desktop applications comes from, but I'd point the
finger at the mountains of code beneath the surface before I pointed
at the language implementations at the foundation. When it comes to
raw calculating speed, Java and C# are actually pretty fast these
days. It's when you need to do custom memory handling for good
performance that C++ wins.
Post by Dave Smith
Anyway, I'll get off the C++ pedestal. Sometimes I feel like I'm still
rooting for the losing team, but for certain applications C++ is just
great, even if your last name isn't Stroustrup.
I agree, C++ is a great tool for some applications. I haven't needed
to write any of those, though, so I stick with C, which is at least
fairly simple and ubiquitous, and sane languages. ;)

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Nicholas Leippe
2007-09-20 15:21:45 UTC
Permalink
Post by Levi Pearson
If only there was a sane macro system available that operated on
the parse tree rather than the text...
If only ;)

(/me points out Levi's allusion to Lisp for anyone that missed it)


/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-20 15:29:51 UTC
Permalink
Post by Nicholas Leippe
Post by Levi Pearson
If only there was a sane macro system available that operated on
the parse tree rather than the text...
If only ;)
(/me points out Levi's allusion to Lisp for anyone that missed it)
Well, Lisp isn't the ONLY language that does macros this way. :)

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-20 21:54:10 UTC
Permalink
Post by Levi Pearson
Most user-interactive desktop applications aren't really dealing with
'raw speed', whatever that means. I'm not really sure where the
slowness of modern desktop applications comes from, but I'd point the
finger at the mountains of code beneath the surface before I pointed
at the language implementations at the foundation. When it comes to
raw calculating speed, Java and C# are actually pretty fast these
days. It's when you need to do custom memory handling for good
performance that C++ wins.
I agree with all your points, but wanted to comment on the one above.

We might get a glimpse into who wins the Java vs. C++ desktop
application performance battle once Qt Jambi is more mature. That'll
eliminate the "mountains of code" and provide an apples-to-apples
comparison of languages I hope.

--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Derek Davis
2007-09-26 19:27:04 UTC
Permalink
Post by Dave Smith
2. You need to read a few entries in the C++ Super FAQ and a few pages
from the beginning of Josuttis's book, The C++ Standard Library (which
pages incidentally have nothing to do with the C++ Standard Library but
cover basic C++ language features).
Do you have a link to the super faq?

Derek

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-26 19:30:50 UTC
Permalink
Post by Derek Davis
Do you have a link to the super faq?
I stand mistaken. It's called the C++ FAQ LITE, and this is the URL:

http://www.parashift.com/c++-faq-lite/

Lots of good stuff there, like should you use the "this" pointer in a
constructor? :)

http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.7

--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Derek Davis
2007-09-26 19:54:34 UTC
Permalink
Post by Dave Smith
http://www.parashift.com/c++-faq-lite/
Lots of good stuff there, like should you use the "this" pointer in a
constructor? :)
http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.7
--Dave
Awesome, thanks. I guess that would explain why all my searches for
the c++ superfaq were fruitless. At least, lacking in the fruit that
I was looking for.

Derek

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/

Bryan Sant
2007-09-20 05:20:35 UTC
Permalink
Post by Michael L Torrie
Manual memory management is often, in my opinion, a huge plus of C and
C++. With reference-counting smart pointers and destructors, memory
management in C++ is very straight-forward, fast, and safe. Just
I agree. Using smart pointers makes a big difference.
Post by Michael L Torrie
understanding a bit about how C++ does scoping, and you can very easily
build and destroy entire data structures all without a single leak and
without having to rely on a garbage collector. Not saying a GC is bad;
simply that it's not always necessary.
I also agree. A proper understanding of memory management does your body good.
Post by Michael L Torrie
With only a few caveats, C++ programmers can easily move to programming
in Java and dealing with the GC dropping references right and left. But
it rarely works the other way. 9/10 core dumps in the BYU CS 240 class
are because of Java. It pollutes C++ programmers like nothing else!
The problem isn't a high-level language. The problem is dumb BYU CS
students ;-).
Post by Michael L Torrie
One of the caveats for C++ programmers is the way that C++ guarantees
destruction when an object goes out of scope, especially when you rely
on the side effects. Guaranteed destruction doesn't translate directly
into Java, although the "finalize" keyword can help (if I recall
correctly). A good article on the idea is at
http://royontechnology.blogspot.com/2007/01/deterministic-destruction-of-objects-in.html
Finalize actually has its warts. It does provide somewhat of a
nondeterministic destructor, but using it delays your object from
being collected for an extra generation. Not a huge issue, but if you
want better heap control and deterministic clean-up of resources, it's
best to use a standard method that you manually call.

-Bryan

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-19 21:26:06 UTC
Permalink
Post by Steve
Ok now this has got me stumped what was unsafe about #2?
Taking advantage of the type safety features in the standard lib, is
what I was trying to show, and I have a lot of code running using
both. If thats unsafe, I have some patchin' to do.
OK, I reread your description of the problem, and I guess they didn't
forbid using standard libraries as I thought they did. Using C++
streams is still serious overkill. Anyway, your emphasis here on
'type safety' is misplaced. The type of your function is int ->
string, and solving the problem doesn't require any typecasting at
all. As far as type theory goes, you're setting up a map (i.e., a
function) from integers to their corresponding string representations.
Maybe you could explain better what your type safety concerns were?

Clearly, they wanted to see if you knew how to algorithmically
translate an integer to a string. You totally punted on that problem.
See several other examples in the thread for what I imagine they were
looking for.

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Steve
2007-09-19 22:06:40 UTC
Permalink
So actually it looks like I misunderstood the original question.
Rather than some fundamental flaw in what I'm doing.

#2 was a serious screw up on my part I should have known better.

But I think #1 was still an elegant answer, although maybe in an
embedded situation it was less than optimal.

I'll keep searching for a better solution, but my thinking was < code
== less things to go wrong.
Post by Levi Pearson
Post by Steve
Ok now this has got me stumped what was unsafe about #2?
Taking advantage of the type safety features in the standard lib, is
what I was trying to show, and I have a lot of code running using
both. If thats unsafe, I have some patchin' to do.
OK, I reread your description of the problem, and I guess they didn't
forbid using standard libraries as I thought they did. Using C++
streams is still serious overkill. Anyway, your emphasis here on
'type safety' is misplaced. The type of your function is int ->
string, and solving the problem doesn't require any typecasting at
all. As far as type theory goes, you're setting up a map (i.e., a
function) from integers to their corresponding string representations.
Maybe you could explain better what your type safety concerns were?
Clearly, they wanted to see if you knew how to algorithmically
translate an integer to a string. You totally punted on that problem.
See several other examples in the thread for what I imagine they were
looking for.
--Levi
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-19 21:48:52 UTC
Permalink
Post by Steve
But I think #1 was still an elegant answer, although maybe in an
embedded situation it was less than optimal.
I'll keep searching for a better solution, but my thinking was < code
== less things to go wrong.
Here's an example for you, from http://www.jb.man.ac.uk/~slowe/cpp/itoa.html

char* itoa( int value, char* result, int base ) {
// check that the base if valid
if (base < 2 || base > 16) { *result = 0; return result; }

char* out = result;
int quotient = value;

do {
*out = "0123456789abcdef"[ std::abs( quotient % base ) ];
++out;
quotient /= base;
} while ( quotient );

// Only apply negative sign for base 10
if ( value < 0 && base == 10) *out++ = '-';
std::reverse( result, out );
*out = 0;

return result;
}

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Bryan Sant
2007-09-19 22:30:54 UTC
Permalink
Post by Levi Pearson
char* itoa( int value, char* result, int base ) {
// check that the base if valid
if (base < 2 || base > 16) { *result = 0; return result; }
char* out = result;
int quotient = value;
do {
*out = "0123456789abcdef"[ std::abs( quotient % base ) ];
++out;
quotient /= base;
} while ( quotient );
// Only apply negative sign for base 10
if ( value < 0 && base == 10) *out++ = '-';
std::reverse( result, out );
*out = 0;
return result;
}
This only supports a base of up to 16. Make me wonder what the limit
is for the actual libc runtime's itoa implementation. We use a base
59 number in some software I use here at work. Why 59 and not 64?
Because we've removed the vowels so it's impossible for swear words to
be generated :-).

-Bryan

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Michael L Torrie
2007-09-19 22:37:47 UTC
Permalink
Post by Bryan Sant
This only supports a base of up to 16. Make me wonder what the limit
is for the actual libc runtime's itoa implementation. We use a base
59 number in some software I use here at work. Why 59 and not 64?
Because we've removed the vowels so it's impossible for swear words to
be generated :-).
itoa is not a standard c library routine, as far as I can tell. It
exists in a non-standard way in some libc's, but it's certainly not
portable.



/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-19 21:42:14 UTC
Permalink
Post by Steve
So my question is, what is wrong with this method (I haven't tested it
so there may be a minor syntax error, but that aside)?
If clock cycles really count, the last thing you want to do is invoke
the pains of the C++ streaming crap. You maybe could have done something
like this on a lean system:

void itoa( int i, char *buf )
{
sprintf( buf, "%d", i );
}

Some purists would say this is begging for a buffer overflow, but I beg
to differ, since you can't sprintf() an int malformed enough to get
arbitrary code execution, even though you could at least break the
program by writing over the end of buf. A safer bet would be to require
the user to pass a buffer size and use snprintf(). Whatever the case,
it's probably going to be a lot faster than the C++ stream API.

And for an even leaner system I would have done this (yes, I have worked
on systems recently where there is not enough memory even for sprintf()):

// An integer-based pow10()
int pow10i( unsigned int p )
{
int ret = 1;
while(p--)
ret *= 10;
return ret;
}

void itoa( int value, char *buf )
{
if( value == 0 )
{
buf[0] = '0';
buf[1] = '\0';
}
else
{
int i = value;
int is_negative = ( i < 0 ) ? 1 : 0;
if( is_negative )
{
buf[0] = '-';
i = -i;
}

int len = (int)log10( i ) + 1;

for( int pos=len-1; pos>=0; pos-- )
{
const int power = pow10i( pos );
int val = ( i - ( i % power ) ) / power;

i -= ( val * power );
int index = is_negative ? len-pos : len-pos-1;
buf[index] = '0' + val;
}

int index = is_negative ? len+1 : len;
buf[index] = 0;
}
}

Please don't make fun of my code -- I wrote it in just a couple minutes,
and I'm ashamed of the floating point math it does. :) Turns out the
sprintf() version is about twice as fast, but my "raw" version would run
on even the leanest of systems with only a small amount of code space.

Bring on the flames about crappy C code... :)

Also, the best answer to this question would have been: Search google to
find a fast, robust, and most importantly tested itoa() implementation
(which I did not do -- this was all original coding, for better or worse).

--Dave


Oh, and if you fancy yourself a good embedded C/C++ coder, why don't you
send me your resume. My company is paying top dollar these days for good
embedded developers. We're also looking for Qt coders as well as Linux
C++ and Python developers. And no, you will probably NOT be asked to
implement itoa() in an interview, but you may be asked to implement
atoi(), which is quite a bit easier. ;)

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-19 21:52:59 UTC
Permalink
Okay, here's one that is MUCH faster which I did after reading Weston's
(better) code. It's still not quite as fast as the sprintf() version,
but much closer (about 25% slower instead of twice as slow).

int pow10i( unsigned int p )
{
int ret = 1;
while(p--)
ret *= 10;
return ret;
}

void itoa( int value, char *buf )
{
if( value == 0 )
{
buf[0] = '0';
buf[1] = '\0';
}
else
{
int i = value;

if( value < 0 )
{
buf[0] = '-';
i = -i;
}

int len = (int)log10( i ) + 1;

for( int pos=len-1; pos>=0; pos-- )
{
int digit = i % 10;
int index = value < 0 ? pos+1 : pos;
buf[index] = '0' + digit;
i /= 10;
}

int index = value < 0 ? len+1 : len;
buf[index] = '\0';
}
}


/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Steve
2007-09-20 03:32:37 UTC
Permalink
Hey Dave I thought I did send you my resume.
Post by Dave Smith
Okay, here's one that is MUCH faster which I did after reading Weston's
(better) code. It's still not quite as fast as the sprintf() version,
but much closer (about 25% slower instead of twice as slow).
int pow10i( unsigned int p )
{
int ret = 1;
while(p--)
ret *= 10;
return ret;
}
void itoa( int value, char *buf )
{
if( value == 0 )
{
buf[0] = '0';
buf[1] = '\0';
}
else
{
int i = value;
if( value < 0 )
{
buf[0] = '-';
i = -i;
}
int len = (int)log10( i ) + 1;
for( int pos=len-1; pos>=0; pos-- )
{
int digit = i % 10;
int index = value < 0 ? pos+1 : pos;
buf[index] = '0' + digit;
i /= 10;
}
int index = value < 0 ? len+1 : len;
buf[index] = '\0';
}
}
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-20 04:29:18 UTC
Permalink
Post by Dave Smith
Okay, here's one that is MUCH faster which I did after reading
Weston's (better) code. It's still not quite as fast as the sprintf()
version, but much closer (about 25% slower instead of twice as slow).
So now I'm curious. What is sprintf() doing that is so much faster. I
mean, it has to parse a format string in addition to doing all the itoa
magic. Anyone care to shed some light? Is my algorithm flawed by design
(like most of my code)?

Off to read the sprintf() code...

--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Derek Davis
2007-09-20 14:53:47 UTC
Permalink
Post by Dave Smith
So now I'm curious. What is sprintf() doing that is so much faster. I
mean, it has to parse a format string in addition to doing all the itoa
magic. Anyone care to shed some light? Is my algorithm flawed by design
(like most of my code)?
Off to read the sprintf() code...
--Dave
Why not. I'll post my answer just since it came out faster than
sprintf. Well, when I pass in a buffer, rather than allocating it in
the function. Don't know about the speed comparison when allocating.
Not that this answers your question...

------------------------------------------------------

/* This space reserved for comments. */
char * itoa(int num)
{
int tmp = num;
int digits = 0;
while(tmp)
{
tmp/=10;
digits++;
}
if(num < 0)
digits++;
char *res = new char[digits+1];
res[digits--] = 0;
if(num < 0)
{
res[0] = '-';
tmp = num * -1;
}
else
tmp = num;
while(tmp)
{
res[digits--] = '0' + tmp%10;
tmp/=10;
}
return res;
}

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-20 19:41:36 UTC
Permalink
Post by Derek Davis
Why not. I'll post my answer just since it came out faster than
sprintf. Well, when I pass in a buffer, rather than allocating it in
the function. Don't know about the speed comparison when allocating.
Not that this answers your question...
Yes, it turns out that using log10() is the slow point. Floating point
math must be a lot slower than I thought. I wrote and used this
integer-based log10 function function instead and now my code is about
twice as fast as sprintf(). Yay.

int log10i( unsigned int i )
{
int ret = 0;
while( i )
{
i /= 10;
ret++;
}
return ret;
}

--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Steve
2007-09-21 20:45:45 UTC
Permalink
Since the sprintf function is slowing down on the log10 due to
floating point arithmetic, I wonder if a further optimization could be
made by rewriting the log10 function in assembler to take advantage of
the floating point registers?
My ASM is way to rusty to attempt this right now but I think I'll go
through the IA-32 docs and check out the feasibility of doing this.

While it would be pointless for itoa (we would still need to convert
float to int), it might work nicely for an ftoa function :)

Sincerely,
Steve
Post by Dave Smith
Post by Derek Davis
Why not. I'll post my answer just since it came out faster than
sprintf. Well, when I pass in a buffer, rather than allocating it in
the function. Don't know about the speed comparison when allocating.
Not that this answers your question...
Yes, it turns out that using log10() is the slow point. Floating point
math must be a lot slower than I thought. I wrote and used this
integer-based log10 function function instead and now my code is about
twice as fast as sprintf(). Yay.
int log10i( unsigned int i )
{
int ret = 0;
while( i )
{
i /= 10;
ret++;
}
return ret;
}
--Dave
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-21 20:48:56 UTC
Permalink
Post by Steve
Since the sprintf function is slowing down on the log10 due to
floating point arithmetic, I wonder if a further optimization could be
made by rewriting the log10 function in assembler to take advantage of
the floating point registers?
My ASM is way to rusty to attempt this right now but I think I'll go
through the IA-32 docs and check out the feasibility of doing this.
While it would be pointless for itoa (we would still need to convert
float to int), it might work nicely for an ftoa function :)
sprintf() wasn't slowing down on fp math, his previous code was. And,
before dropping down to asm for optimization, it would probably be
worthwhile to tell gcc to output the asm it's generating to see if
it's already using whatever optimization you were thinking of.

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-21 20:59:05 UTC
Permalink
Post by Steve
Since the sprintf function is slowing down on the log10 due to
floating point arithmetic, I wonder if a further optimization could be
made by rewriting the log10 function in assembler to take advantage of
the floating point registers?
I found that writing an integer-based log10 function sped up my itoa
function considerably. I don't think you could do anything faster in
ASM, especially since floating point math is not strictly required anyway.

Anyway, I revised my implementation today to not *need* a log10 function
by writing the ascii value into the *end* of the buffer, right-to-left,
least-significant to most-significant, and then it returns a pointer to
the beginning of the ascii string it creates. That way, I didn't need to
calculate the length of the string up front (or ever), and it cut my
runtime in half.

I did, however, find another special case (in addition to zero), and
that is INT_MIN. If you can figure out why INT_MIN is a special case,
you get bonus Plug Karma points. The answer lies in the result of the
following C expression:

abs( INT_MIN ); // What does this return? :)

(Note: you'll need to include <limits.h> and <stdlib.h>)

I found this special case by running 2^32 test cases for every 32 bit
integer, and 48 minutes later, it reported that it passes (ie, returns
the same as sprintf()) for all 2^32 values. Also, my code performed
about 3.3 million itoa()'s per second. Not too shabby. :)

I wonder how Java would do, and Qt's QString::number() for that matter.

--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Steve
2007-09-21 21:03:54 UTC
Permalink
/me drools
I have GOT to see this code Dave!
Post by Dave Smith
Post by Steve
Since the sprintf function is slowing down on the log10 due to
floating point arithmetic, I wonder if a further optimization could be
made by rewriting the log10 function in assembler to take advantage of
the floating point registers?
I found that writing an integer-based log10 function sped up my itoa
function considerably. I don't think you could do anything faster in
ASM, especially since floating point math is not strictly required anyway.
Anyway, I revised my implementation today to not *need* a log10 function
by writing the ascii value into the *end* of the buffer, right-to-left,
least-significant to most-significant, and then it returns a pointer to
the beginning of the ascii string it creates. That way, I didn't need to
calculate the length of the string up front (or ever), and it cut my
runtime in half.
I did, however, find another special case (in addition to zero), and
that is INT_MIN. If you can figure out why INT_MIN is a special case,
you get bonus Plug Karma points. The answer lies in the result of the
abs( INT_MIN ); // What does this return? :)
(Note: you'll need to include <limits.h> and <stdlib.h>)
I found this special case by running 2^32 test cases for every 32 bit
integer, and 48 minutes later, it reported that it passes (ie, returns
the same as sprintf()) for all 2^32 values. Also, my code performed
about 3.3 million itoa()'s per second. Not too shabby. :)
I wonder how Java would do, and Qt's QString::number() for that matter.
--Dave
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-21 21:18:59 UTC
Permalink
Post by Dave Smith
I did, however, find another special case (in addition to zero), and
that is INT_MIN. If you can figure out why INT_MIN is a special case,
you get bonus Plug Karma points. The answer lies in the result of the
abs( INT_MIN ); // What does this return? :)
(Note: you'll need to include <limits.h> and <stdlib.h>)
The answer lies in 2's complement binary representation of integers.
There aren't the same number of integers to the left and to the right
of 0; there's one more to the left. So, the absolute value of the
smallest representable integer isn't representable in the same number
of bits.

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Bryan Sant
2007-09-21 22:10:53 UTC
Permalink
Post by Levi Pearson
The answer lies in 2's complement binary representation of integers.
There aren't the same number of integers to the left and to the right
of 0; there's one more to the left. So, the absolute value of the
smallest representable integer isn't representable in the same number
of bits.
Crap. I was about to sound smart mentioning the 2's complement issue
-- albeit not as well as you did.

More fun info about it here:
http://en.wikipedia.org/wiki/Two%27s_complement#The_weird_number

-Bryan

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-21 22:12:27 UTC
Permalink
Post by Levi Pearson
The answer lies in 2's complement binary representation of integers.
There aren't the same number of integers to the left and to the right
of 0; there's one more to the left. So, the absolute value of the
smallest representable integer isn't representable in the same number
of bits.
Yup, so in other words abs(INT_MIN) returns INT_MIN because it wraps
around. :)

Nice work Levi, and a good explanation.

--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Steve
2007-09-21 22:16:33 UTC
Permalink
So does this presnet a real world bug, or would the user just perform
some bounds checking prior to handing it off to the function?
Something like...

if(myval < 0 && abs(myval) == myval)

or maybe

if(myval == INT_MIN)
Post by Dave Smith
Post by Levi Pearson
The answer lies in 2's complement binary representation of integers.
There aren't the same number of integers to the left and to the right
of 0; there's one more to the left. So, the absolute value of the
smallest representable integer isn't representable in the same number
of bits.
Yup, so in other words abs(INT_MIN) returns INT_MIN because it wraps
around. :)
Nice work Levi, and a good explanation.
--Dave
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-21 22:21:56 UTC
Permalink
Post by Steve
So does this presnet a real world bug, or would the user just perform
some bounds checking prior to handing it off to the function?
Well, you could specify that your function isn't required to work on
the smallest int value representable in your int, but that makes the
function less general. Presumably, if you want to translate integers
to strings, you want to be able to do it for any integer you can
represent!

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-22 07:52:37 UTC
Permalink
Post by Steve
So does this presnet a real world bug, or would the user just perform
some bounds checking prior to handing it off to the function?
Something like...
if(myval < 0 && abs(myval) == myval)
or maybe
if(myval == INT_MIN)
I solved it with the latter approach, and using a constant string
literal for INT_MIN as a #define'd macro string value. See my code
(posted later).

--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Charles Curley
2007-09-21 21:50:34 UTC
Permalink
Post by Dave Smith
Anyway, I revised my implementation today to not *need* a log10 function
by writing the ascii value into the *end* of the buffer, right-to-left,
least-significant to most-significant, and then it returns a pointer to
the beginning of the ascii string it creates. That way, I didn't need to
calculate the length of the string up front (or ever), and it cut my
runtime in half.
Ah, good. Did you finally get around to looking at the solution I
posted to this thread two days ago?
--
Charles Curley /"\ ASCII Ribbon Campaign
Looking for fine software \ / Respect for open standards
and/or writing? X No HTML/RTF in email
http://www.charlescurley.com / \ No M$ Word docs in email

Key fingerprint = CE5C 6645 A45A 64E4 94C0 809C FFF6 4C48 4ECD DFDB
Dave Smith
2007-09-21 22:11:44 UTC
Permalink
Post by Charles Curley
Ah, good. Did you finally get around to looking at the solution I
posted to this thread two days ago?
Yup, it's pretty nice with only a few warts. It doesn't work for zero or
INT_MIN. It also crashes at random points (sometimes after 6700
invocations, sometimes after 5001, sometimes after 8105, etc) when run
in a for loop.

My version skips the strcat() at the end of itoa() in favor of just
returning a char* that points at where my itoa() put the string in the
specified buffer. The caller can then know where to find the int. It
saves a bit of time, but at the slight cost of a somewhat bulky API.


Here's my code:

#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <stdlib.h>

// This is a special case. Redefine it for platforms with a different
value of INT_MIN:
#define INT_MIN_STRING "-2147483648"

// Writes the value in ascii into buf, and returns a pointer inside
// buf where the ascii string begins. The string will be null
// terminated
char* itoa( int value, char *buf, int buflen )
{
// Write the number from the end of the buffer, right-to-left:
int pos = buflen - 1;

buf[pos--] = '\0';

// Special case for zero:
if( value == 0 )
{
buf[pos--] = '0';
}
// You can't take the absolute value of INT_MIN, because it would be
too long, and
// overflow INT_MAX, and result in INT_MIN again, so it's a special
case.
else if( value == INT_MIN )
{
strcpy( buf, INT_MIN_STRING );
return buf;
}
// Done with special cases. Do the generic case:
else
{
int i = abs( value );

// Write each 10's place value, starting at the right:
do
{
buf[pos--] = '0' + (i % 10);
}
while( i /= 10 );

// Prepend a negative sign for negative values:
if( value < 0 )
{
buf[pos--] = '-';
}
}

// Return a pointer into the buf where we finished writing
// the front of the string:
return &(buf[pos+1]);
}


--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-21 22:19:00 UTC
Permalink
Post by Dave Smith
My version skips the strcat() at the end of itoa() in favor of just
returning a char* that points at where my itoa() put the string in the
specified buffer. The caller can then know where to find the int. It
saves a bit of time, but at the slight cost of a somewhat bulky API.
This is probably okay if you really need the extra performance, but it
makes the memory management awkward. You've now got to keep track of
the original buffer as well as the pointer to where the string begins
within it. If you don't realize this and call free on the point where
the string begins, you're likely to have trouble.

I prefer the K&R approach of building the string backwards and
reversing it afterward.

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Charles Curley
2007-09-21 23:32:27 UTC
Permalink
Post by Dave Smith
Post by Charles Curley
Ah, good. Did you finally get around to looking at the solution I
posted to this thread two days ago?
Yup, it's pretty nice with only a few warts. It doesn't work for zero or
INT_MIN.
Picky, picky.

OK, 0 is a likely value in an embedded context, so I added code for
that. I have two return statements, which would save a branch
instruction on some compilers, but maybe not on others.

I didn't handle INT_MIN because it's a boundary condition, and not a
likely value. If INT_MIN is likely, the caller should be using longs,
and hope they are larger than ints (not required by ANSI).

Also, I'm not wild about your solution because it isn't portable to
different int sizes, as you noted. A good embedded library function
should be as portable as possible, but that comes after speed and
compactness, and your string and code will take up valuable ROM space.

--------------------------------------------------

char *itoa (int value, char *buffer)
{
int i = 0;

/* I have no way of knowing the size of the incoming buffer, so I
* have to build in a buffer, then copy into the given one. strcat
* is likely more efficient and faster than strncat, and probably
* coded in assembler. */
char tmp[SIZE+1];

tmp[SIZE] = 0;

if (value) {

/* For negative numbers, set the sign. Having done that, do
* the conversion as a positive number. */
if (value < 0) {
strcat (buffer, "-");
value = 0 - value;
}

while (value) {
i++;

/* Since we are starting at the right end of the string, we
* fill our temporary buffer from the right. This is likely
* another trick the examiner wanted. */
tmp[SIZE-i] = digit (value%base);
value = value/base;
}
strcat (buffer, &tmp[SIZE-i]);
return; /* Unstructured! Unstructured! */
} else {
/* Handle the case of 0. */
strcat (buffer, "0");
}
}
--------------------------------------------------
Post by Dave Smith
It also crashes at random points (sometimes after 6700
invocations, sometimes after 5001, sometimes after 8105, etc) when run
in a for loop.
Odd. Any thoughts on why? I haven't tried it yet. I don't see any
likely buffer overflows. Maybe padding SIZE would help??

What did you used for base? If that's large enough, then it could
cause an overflow in digit. The largest safe value for base is going
to be compiler and and processor dependent, but probably not very
useful.
Post by Dave Smith
My version skips the strcat() at the end of itoa() in favor of just
returning a char* that points at where my itoa() put the string in the
specified buffer. The caller can then know where to find the int. It
saves a bit of time, but at the slight cost of a somewhat bulky API.
Bulky and confusing. For one thing, the caller has to know how large
to make the buffer, so has to know something about the processor,
which reduces portability. Portability is not high on the list of
features for embedded code, but still there. And it is higher for
library functions.

A cleaner way to do it would be to malloc a buffer, fill and return
that. But that has two problems: the caller has then to free it, and
likely will have to strcat into a string already under construction.

I don't believe there is a good solution here.
It's a pretty good solution. A few nitpicks. Some of those are based
on quirks of oddball compilers and processors I've met....
Post by Dave Smith
#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <stdlib.h>
// This is a special case. Redefine it for platforms with a different
#define INT_MIN_STRING "-2147483648"
// Writes the value in ascii into buf, and returns a pointer inside
// buf where the ascii string begins. The string will be null
// terminated
char* itoa( int value, char *buf, int buflen )
{
int pos = buflen - 1;
buf[pos--] = '\0';
if( value == 0 )
Some 16 bit compilers are stupid enough to compile the comparison
instruction instead of using a TST instruction for this (assuming the
processor has a TST instruction), so I did this as

if (value)
Post by Dave Smith
{
buf[pos--] = '0';
I'd put a return here. I know, unstructured. This is embedded code; we
bend the rules; Henry Kissinger would admire it.
Post by Dave Smith
}
// You can't take the absolute value of INT_MIN, because it would be
too long, and
// overflow INT_MAX, and result in INT_MIN again, so it's a special
case.
else if( value == INT_MIN )
{
strcpy( buf, INT_MIN_STRING );
return buf;
}
else
{
int i = abs( value );
I did the abs with a subtraction instruction because I don't know if
the abs function is declared inline. If it isn't, I've saved a JSR
call and return at run time.
Post by Dave Smith
do
{
buf[pos--] = '0' + (i % 10);
}
while( i /= 10 );
if( value < 0 )
{
buf[pos--] = '-';
}
}
// Return a pointer into the buf where we finished writing
return &(buf[pos+1]);
}
--Dave
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
--
Charles Curley /"\ ASCII Ribbon Campaign
Looking for fine software \ / Respect for open standards
and/or writing? X No HTML/RTF in email
http://www.charlescurley.com / \ No M$ Word docs in email

Key fingerprint = CE5C 6645 A45A 64E4 94C0 809C FFF6 4C48 4ECD DFDB
Levi Pearson
2007-09-22 00:07:23 UTC
Permalink
Post by Charles Curley
Picky, picky.
OK, 0 is a likely value in an embedded context, so I added code for
that. I have two return statements, which would save a branch
instruction on some compilers, but maybe not on others.
I didn't handle INT_MIN because it's a boundary condition, and not a
likely value. If INT_MIN is likely, the caller should be using longs,
and hope they are larger than ints (not required by ANSI).
So, you're punting on a known corner case because it's unlikely?
C'mon, you know that's wrong. Besides, you can make it work without
special-casing it.

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Charles Curley
2007-09-22 00:37:45 UTC
Permalink
Post by Charles Curley
Ah, good. Did you finally get around to looking at the solution I
posted to this thread two days ago?
It also crashes at random points (sometimes after 6700 invocations,
sometimes after 5001, sometimes after 8105, etc) when run in a for
loop.
With the version of the itoa function I posted earlier, I'm not seeing
it crash. I re-wrote the test frame as follows:

--------------------------------------------------

main (int argc, char *argv[])
{
char buffer[23];
char test[23];
int i;
int start = -1;
int stop = 10000000;

/* buffer[0] = 0; /* Force buffer to hold a string of
* length 0. Likely another trick the
* examiner wanted. */

base = 10;

printf ("%s, a program to exercise a subroutine to convert an int to a string.\n",
argv[0]);
printf ("Going from %i to %i.\n", start, stop);

/* printf ("0x%x converts to a string as %s.\n", test, itoa (test, buffer)); */

for (i=start; i != stop ; i++) {
/* printf ("i = %i\n", i); */
buffer[0] = 0;
sprintf (test, "%i", i);
itoa (i, buffer);
if (strcmp (buffer, test)) {
printf ("Output differs from expected. Test: %s. Buffer: %s\n",
test, buffer);
}
}
}
--------------------------------------------------

(I think it is safe to assume that sprintf works correctly....)

and got:

--------------------------------------------------
[***@dragon itoa]$ time ./itoa
./itoa, a program to exercise a subroutine to convert an int to a string.
Going from -1 to 10000000.

real 0m6.341s
user 0m6.241s
sys 0m0.013s
[***@dragon itoa]$
--------------------------------------------------

on a processor like so:

--------------------------------------------------
model name : Intel(R) Pentium(R) M processor 1.60GHz
stepping : 6
cpu MHz : 1598.674
cache size : 2048 KB
--------------------------------------------------

Have we beaten this thing into the ground yet?
--
Charles Curley /"\ ASCII Ribbon Campaign
Looking for fine software \ / Respect for open standards
and/or writing? X No HTML/RTF in email
http://www.charlescurley.com / \ No M$ Word docs in email

Key fingerprint = CE5C 6645 A45A 64E4 94C0 809C FFF6 4C48 4ECD DFDB
Levi Pearson
2007-09-22 02:37:11 UTC
Permalink
Post by Charles Curley
Have we beaten this thing into the ground yet?
Not quite yet. Try this one:

void itoa(int n, char s[]) {
int i, sign;
sign = n;

i = 0;
do {
s[i++] = "0123456789"[abs(n % 10)];
} while ( n /= 10 );
if (sign < 0)
s[i++] = '-';

s[i] = '\0';
reverse(s);
}

void reverse(char s[]) {
int c, i, j;
for ( i = 0, j = strlen(s)-1; i < j; i++, j--) {
c = s[i];
s[i] = s[j];
s[j] = c;
}
}

This is a modification of someone's modification to the version in the
K&R C book. You'll find it works correctly for all ints, doesn't
behave badly when you pass it the same buffer several times without
clearing it first, is nice and compact and elegant, and is at least as
fast as yours. Oh, and it doesn't rely on casting a char to an int,
so it's "type safe". ;)

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Charles Curley
2007-09-22 04:45:22 UTC
Permalink
Post by Charles Curley
Have we beaten this thing into the ground yet?
I like it. A few points.

* It is base ten only. My solution is much more general. Need base 36?
Recall that base 10 was not specified in the original problem. But
your method can be adapted at the cost of:

* expanding the lookup table,

* dividing by a variable instead of an inline constant, usually
slower.

Also, your lookup table is probably faster than my calculation.

* You have a call to abs inside your main loop. I think I see why, but
would you safely speed things up by moving it outside?

* Your reverse function is nice, but probably slower than strcat
because it uses an intermediate variable. But I think you can speed
it up considerably by only running for (strlen (s))/2 iterations. If
so, it probably would be better than my temporary buffer.

If the compiler will optimize the /2 into a right shift, you get rid
of a divide instruction. (Or, on most 16 bit processors, a divide
function.) Or do it yourself with (strlen (s))>>.

* I replaced my while loop with a do ... while loop, and found (to no
great surprise) that it is slower. I didn't check, but I doubt it
saved any space because of the way do ... while loops usually
compile. That did let me get rid of the 0 special case, making
things more elegant. The same might apply here.

I made a few other optimizations in my version.

* I now initialize i to SIZE and decrement it. This gets rid of a
subtraction instruction inside the main loop.

* I declared some variables to be register variables. That helps on
Intel architecture, which is so pathetically register poor. It might
or might not help on register rich processors because the compiler
might do it for you.

* I declared the function digit inline. That helped. But it would make
for a smaller code size only if digit isn't used elsewhere. It could
be a useful routine for special cases like a hex dump.

--------------------------------------------------

/* We depend on a pure ASCII environment here. */
inline char digit (int d)
{
register char ret = '0' + d;

/* For bases greater than 10, skip from '9' to 'A' in the ASCII
* table. This may have been a trick your examiner was looking
* for. */
if (ret > '9') {
ret += 7;
}
return (ret);
}

char *itoa (int value, char *buffer)
{
register int i = SIZE;

/* I have no way of knowing the size of the incoming buffer, so I
* have to build in a buffer, then copy into the given one. strcat
* is likely more efficient and faster than strncat, and probably
* coded in assembler. */
char tmp[SIZE+1];

tmp[SIZE] = 0;

/* For negative numbers, set the sign. Having done that, do
* the conversion as a positive number. */
if (value < 0) {
strcat (buffer, "-");
value = 0 - value;
}

do {
/* Since we are starting at the right end of the string, we
* fill our temporary buffer from the right. This is likely
* another trick the examiner wanted. */
tmp[--i] = digit (value%base);
value = value/base;
} while (value);

strcat (buffer, &tmp[i]);
}
--------------------------------------------------
--
Charles Curley /"\ ASCII Ribbon Campaign
Looking for fine software \ / Respect for open standards
and/or writing? X No HTML/RTF in email
http://www.charlescurley.com / \ No M$ Word docs in email

Key fingerprint = CE5C 6645 A45A 64E4 94C0 809C FFF6 4C48 4ECD DFDB
Steve
2007-09-22 04:54:09 UTC
Permalink
How much function call overhead is this incurring?
Would it run faster if you took the digit function and just embedded
it directly into the itoa function?
Post by Charles Curley
Post by Charles Curley
Have we beaten this thing into the ground yet?
I like it. A few points.
* It is base ten only. My solution is much more general. Need base 36?
Recall that base 10 was not specified in the original problem. But
* expanding the lookup table,
* dividing by a variable instead of an inline constant, usually
slower.
Also, your lookup table is probably faster than my calculation.
* You have a call to abs inside your main loop. I think I see why, but
would you safely speed things up by moving it outside?
* Your reverse function is nice, but probably slower than strcat
because it uses an intermediate variable. But I think you can speed
it up considerably by only running for (strlen (s))/2 iterations. If
so, it probably would be better than my temporary buffer.
If the compiler will optimize the /2 into a right shift, you get rid
of a divide instruction. (Or, on most 16 bit processors, a divide
function.) Or do it yourself with (strlen (s))>>.
* I replaced my while loop with a do ... while loop, and found (to no
great surprise) that it is slower. I didn't check, but I doubt it
saved any space because of the way do ... while loops usually
compile. That did let me get rid of the 0 special case, making
things more elegant. The same might apply here.
I made a few other optimizations in my version.
* I now initialize i to SIZE and decrement it. This gets rid of a
subtraction instruction inside the main loop.
* I declared some variables to be register variables. That helps on
Intel architecture, which is so pathetically register poor. It might
or might not help on register rich processors because the compiler
might do it for you.
* I declared the function digit inline. That helped. But it would make
for a smaller code size only if digit isn't used elsewhere. It could
be a useful routine for special cases like a hex dump.
--------------------------------------------------
/* We depend on a pure ASCII environment here. */
inline char digit (int d)
{
register char ret = '0' + d;
/* For bases greater than 10, skip from '9' to 'A' in the ASCII
* table. This may have been a trick your examiner was looking
* for. */
if (ret > '9') {
ret += 7;
}
return (ret);
}
char *itoa (int value, char *buffer)
{
register int i = SIZE;
/* I have no way of knowing the size of the incoming buffer, so I
* have to build in a buffer, then copy into the given one. strcat
* is likely more efficient and faster than strncat, and probably
* coded in assembler. */
char tmp[SIZE+1];
tmp[SIZE] = 0;
/* For negative numbers, set the sign. Having done that, do
* the conversion as a positive number. */
if (value < 0) {
strcat (buffer, "-");
value = 0 - value;
}
do {
/* Since we are starting at the right end of the string, we
* fill our temporary buffer from the right. This is likely
* another trick the examiner wanted. */
tmp[--i] = digit (value%base);
value = value/base;
} while (value);
strcat (buffer, &tmp[i]);
}
--------------------------------------------------
--
Charles Curley /"\ ASCII Ribbon Campaign
Looking for fine software \ / Respect for open standards
and/or writing? X No HTML/RTF in email
http://www.charlescurley.com / \ No M$ Word docs in email
Key fingerprint = CE5C 6645 A45A 64E4 94C0 809C FFF6 4C48 4ECD DFDB
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Michael L Torrie
2007-09-22 05:05:03 UTC
Permalink
Post by Steve
How much function call overhead is this incurring?
Would it run faster if you took the digit function and just embedded
it directly into the itoa function?
Practically nothing, since it's declared as "inline." The compiler *is*
embedding it directly in the itoa function.
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Steve
2007-09-22 05:13:42 UTC
Permalink
Correct me if I'm wrong, but I thought inlining was not a guarantee,
but only a suggestion.
Post by Michael L Torrie
Post by Steve
How much function call overhead is this incurring?
Would it run faster if you took the digit function and just embedded
it directly into the itoa function?
Practically nothing, since it's declared as "inline." The compiler *is*
embedding it directly in the itoa function.
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Michael L Torrie
2007-09-22 05:25:11 UTC
Permalink
Post by Steve
Correct me if I'm wrong, but I thought inlining was not a guarantee,
but only a suggestion.
You are correct, but as long as optimization isn't turned completely
off, I'd be very shocked if the compiler didn't follow the suggestion.

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-22 05:11:58 UTC
Permalink
Post by Steve
How much function call overhead is this incurring?
Would it run faster if you took the digit function and just embedded
it directly into the itoa function?
Should be 0, since he declared it to be inline. The compiler could
always ignore that, though. And may have done it already at a high
enough optimization level.

--Levi


/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-22 05:42:29 UTC
Permalink
Post by Charles Curley
I like it. A few points.
* It is base ten only. My solution is much more general. Need base 36?
Recall that base 10 was not specified in the original problem. But
* expanding the lookup table,
* dividing by a variable instead of an inline constant, usually
slower.
Your solution doesn't take 'base' as a paramter, so I can only assume
you've defined it as a constant global variable somewher else in your
program. In that case, your solution is just as fixed as mine. The
modifications to change the base for either are trivial, but require a
recompile.
Post by Charles Curley
* You have a call to abs inside your main loop. I think I see why, but
would you safely speed things up by moving it outside?
Only after the first iteration. It's there so that it handles INT_MIN
properly. It's plenty fast, so I see no reason to move it, but you
could unroll the loop one iteration and not perform the abs() inside
after that if you needed to.
Post by Charles Curley
* Your reverse function is nice, but probably slower than strcat
because it uses an intermediate variable. But I think you can speed
it up considerably by only running for (strlen (s))/2 iterations. If
so, it probably would be better than my temporary buffer.
I think you need to read the reverse function more closely. That's
precisely what it does.
Post by Charles Curley
* I replaced my while loop with a do ... while loop, and found (to no
great surprise) that it is slower. I didn't check, but I doubt it
saved any space because of the way do ... while loops usually
compile. That did let me get rid of the 0 special case, making
things more elegant. The same might apply here.
The code I presented is adapted from the K&R C book, as I said, in the
section on control flow. It was an example of where a do/while loop
would be handy. :)
Post by Charles Curley
* I declared some variables to be register variables. That helps on
Intel architecture, which is so pathetically register poor. It might
or might not help on register rich processors because the compiler
might do it for you.
I think modern compilers do a pretty good job of optimizing register
use when you ask them to, and I think they're also free to ignore a
register declaration. I'm not sure there's any real benefit; you'd
have to compare asm output to tell.
Post by Charles Curley
* I declared the function digit inline. That helped. But it would make
for a smaller code size only if digit isn't used elsewhere. It could
be a useful routine for special cases like a hex dump.
Possibly. The compiler ought to be able to inline it without your
help, though, if you ask it to.

I think your program is pretty good now, aside from the fatal flaw of
not handling MIN_INT properly. That's really not acceptable; it needs
to be correct more than it needs to be sped up. itoa is pretty much
never going to be an embedded system bottleneck, but you never know
when you're going to hit a corner case. Also, on my computer at
least (1.2 Ghz PPC), my version is still faster.

--Levi



/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Dave Smith
2007-09-22 07:59:24 UTC
Permalink
Post by Charles Curley
I like it. A few points.
So how fast are your solutions? How many itoa's can you perform in a
second? My implementation does about 3.3 million per second on a 3.0GHz
machine running RHEL3.

I guess I could test the code myself, but I'm not at work any more
today, and won't be back until Monday, or until I next connect to the
VPN. :)

--Dave

P.S. This was fun. I think the last time we had a good code optimization
conversation was when Sasha posted a challenge. Thanks Sasha! Let's do
that again some time.

P.P.S. Levi, I love the static array code where you index into a string
literal. Awesome stuff. Those K&R fellows sure were smart about their C
code. Almost as if they "wrote the book" on C... ;)

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Charles Curley
2007-09-22 13:00:53 UTC
Permalink
Post by Levi Pearson
Post by Charles Curley
I like it. A few points.
* It is base ten only. My solution is much more general. Need base 36?
Recall that base 10 was not specified in the original problem. But
* expanding the lookup table,
* dividing by a variable instead of an inline constant, usually
slower.
Your solution doesn't take 'base' as a paramter, so I can only assume
you've defined it as a constant global variable somewher else in your
program. In that case, your solution is just as fixed as mine. The
modifications to change the base for either are trivial, but require a
recompile.
You are correct that base is defined elsewhere; it is a global.

Again, this is embedded programming. A calling function can set base
before calling itoa. That's a bit more flexible than adding to your
lookup array, which will be in ROM at runtime.
Post by Levi Pearson
Post by Charles Curley
* You have a call to abs inside your main loop. I think I see why, but
would you safely speed things up by moving it outside?
Only after the first iteration. It's there so that it handles INT_MIN
properly. It's plenty fast, so I see no reason to move it, but you
could unroll the loop one iteration and not perform the abs() inside
after that if you needed to.
OK.
Post by Levi Pearson
Post by Charles Curley
* Your reverse function is nice, but probably slower than strcat
because it uses an intermediate variable. But I think you can speed
it up considerably by only running for (strlen (s))/2 iterations. If
so, it probably would be better than my temporary buffer.
I think you need to read the reverse function more closely. That's
precisely what it does.
You are right; I stand corrected.
Post by Levi Pearson
Post by Charles Curley
* I replaced my while loop with a do ... while loop, and found (to no
great surprise) that it is slower. I didn't check, but I doubt it
saved any space because of the way do ... while loops usually
compile. That did let me get rid of the 0 special case, making
things more elegant. The same might apply here.
The code I presented is adapted from the K&R C book, as I said, in the
section on control flow. It was an example of where a do/while loop
would be handy. :)
Aha, cribbing from the textbook, eh!?
Post by Levi Pearson
Post by Charles Curley
* I declared some variables to be register variables. That helps on
Intel architecture, which is so pathetically register poor. It might
or might not help on register rich processors because the compiler
might do it for you.
I think modern compilers do a pretty good job of optimizing register
use when you ask them to, and I think they're also free to ignore a
register declaration. I'm not sure there's any real benefit; you'd
have to compare asm output to tell.
You are correct that a register declaration is advisory. I agree that
one would have to look at the assembler output. Actually, what I would
do if I were writing this for a real embedded library is look at the
assembler output for several of these versions, pick one or two for
size and speed, and further hand optimize the heck out of that.
Post by Levi Pearson
Post by Charles Curley
* I declared the function digit inline. That helped. But it would make
for a smaller code size only if digit isn't used elsewhere. It could
be a useful routine for special cases like a hex dump.
Possibly. The compiler ought to be able to inline it without your
help, though, if you ask it to.
OK. That depends on the compiler, of course.
Post by Levi Pearson
I think your program is pretty good now, aside from the fatal flaw of
not handling MIN_INT properly. That's really not acceptable; it needs
to be correct more than it needs to be sped up. itoa is pretty much
never going to be an embedded system bottleneck, but you never know
when you're going to hit a corner case. Also, on my computer at
least (1.2 Ghz PPC), my version is still faster.
Fair enough. I think we're down to the point where the actual
application at hand might determine some of our trade-offs. Do we
optimize for speed or size? That depends on whether we are constrained
for ROM space, or for time. Or, Murphy help us, both. Etc.
--
Charles Curley /"\ ASCII Ribbon Campaign
Looking for fine software \ / Respect for open standards
and/or writing? X No HTML/RTF in email
http://www.charlescurley.com / \ No M$ Word docs in email

Key fingerprint = CE5C 6645 A45A 64E4 94C0 809C FFF6 4C48 4ECD DFDB
Dave Smith
2007-09-22 17:34:15 UTC
Permalink
Post by Charles Curley
You are correct that a register declaration is advisory. I agree that
one would have to look at the assembler output. Actually, what I would
do if I were writing this for a real embedded library is look at the
assembler output for several of these versions, pick one or two for
size and speed, and further hand optimize the heck out of that.
You mean you wouldn't test it to see if it's fast enough as is before
launching into a possibly very expensive assembler optimization session?
Even in the smallest of embedded systems, premature optimization is the
still the root of all evil. :) And remember, C is usually pretty
portable, whereas assembler optimizations are almost always not. And
lastly, even on the smallest of embedded systems I've worked on (32Kb
code space, 20Mhz CPU clock), I still have never had to do any hand ASM
optimization.

--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Charles Curley
2007-09-22 18:02:19 UTC
Permalink
Post by Dave Smith
Post by Charles Curley
You are correct that a register declaration is advisory. I agree that
one would have to look at the assembler output. Actually, what I would
do if I were writing this for a real embedded library is look at the
assembler output for several of these versions, pick one or two for
size and speed, and further hand optimize the heck out of that.
You mean you wouldn't test it to see if it's fast enough as is before
launching into a possibly very expensive assembler optimization session?
Even in the smallest of embedded systems, premature optimization is the
still the root of all evil. :) And remember, C is usually pretty
portable, whereas assembler optimizations are almost always not.
All valid points, of course. I should have qualified that with
something like, "If I needed further optimization...." But, as Levi
pointed out, this sort of thing is rarely a bottleneck in embedded
stuff.
Post by Dave Smith
And lastly, even on the smallest of embedded systems I've worked on
(32Kb code space, 20Mhz CPU clock), I still have never had to do any
hand ASM optimization.
Sheesh, such luxury, almost enough room to swing a cat. I've worked on
65xx single chip derivatives with as little as 4K ROM, 1K RAM (of
which only 256 bytes are available for stack), and 1 MHz clock. Come
to think of it, those are the specs of the Apple I.

I've done hand optimization for speed and ROM space. The 65xx family
does not really lend itself to C or Forth because those languages both
"think" in terms of 16 bit data (think PDP-11 assembler and you have C
pretty well). So those processors are often candidates for hand
optimization.

Then there's the RCA Cosmac 1802 series processors. <Shudder/>. The
Rockwell PPS series, with clock speeds measured in
kilohertz. <Scream/>
--
Charles Curley /"\ ASCII Ribbon Campaign
Looking for fine software \ / Respect for open standards
and/or writing? X No HTML/RTF in email
http://www.charlescurley.com / \ No M$ Word docs in email

Key fingerprint = CE5C 6645 A45A 64E4 94C0 809C FFF6 4C48 4ECD DFDB
Dave Smith
2007-09-22 19:16:07 UTC
Permalink
Post by Charles Curley
Sheesh, such luxury, almost enough room to swing a cat. I've worked on
65xx single chip derivatives with as little as 4K ROM, 1K RAM (of
which only 256 bytes are available for stack), and 1 MHz clock. Come
to think of it, those are the specs of the Apple I.
/me bows to your superior embedded fu. :)

--Dave

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Levi Pearson
2007-09-22 18:41:10 UTC
Permalink
Post by Charles Curley
Post by Levi Pearson
Post by Charles Curley
* dividing by a variable instead of an inline constant, usually
slower.
Your solution doesn't take 'base' as a paramter, so I can only assume
you've defined it as a constant global variable somewher else in your
program. In that case, your solution is just as fixed as mine. The
modifications to change the base for either are trivial, but require a
recompile.
You are correct that base is defined elsewhere; it is a global.
Again, this is embedded programming. A calling function can set base
before calling itoa. That's a bit more flexible than adding to your
lookup array, which will be in ROM at runtime.
My point was that your point of criticism only applied if the global
was also a constant. Otherwise, no constant folding, and you still
divide by a variable. Sure, you don't have to pass a parameter, but
we're getting into ugly optimization territory now. I'd never do that
unless circumstances demanded it. Clean, elegant code wins over
prematurely optimized code.

Make it correct, make it simple, and then, if it's not fast enough,
make it fast with as little compromise to the other two as possible.
Anyway, it's been fun. Thanks to smorrey for posting the question. :)

--Levi

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
Charles Curley
2007-09-20 03:04:09 UTC
Permalink
Post by Steve
Here is the setup.
There is no standard way to convert an int into a char*, or at least
there is nothing in the standard library.
So I was asked how I would write an itoa function.
The goal was to target an embedded platform, and so clock cycles and
overhead really count.
The solution could be in either C or C++
I haven't written this in about eighteen years or so, and most of that
was in Forth or assembler. So it took me an hour to come up with a
pure C function with a test frame, and comment it. Some of that was
remembering how to do this in C.

As other folks have noted, the examiner didn't mention which base to
use. This is a general solution: any reasonable base and a lot of
unreasonable ones can be specified by setting the variable base. Doing
so I will leave as an exercise for the student.

As noted below, I am assuming a pure ASCII environment: no unicode, no
code pages. This would also work in a half-ASCII environment, i.e. no
lower case and no non-printing characters. (Yes, I've actually worked
on such a beastie.) What, if any, changes are necessary I will also
leave to the student.

I will also leave several optimizations I can think of to the student.

--------------------------------------------------
/* A subroutine to convert an int to a string representation
* thereof. */

/* Time-stamp: <2007-09-19 20:52:54 ccurley itoa.c> */

/* These should be available in most imbedded systems. stdio.h is
* only needed for the test frame anyway. */
#include <stdio.h>
#include <string.h>

#define SIZE sizeof (int)*2


/* TO DO: Add getopts, etc, so the user can set the base and test
* value. */
int base = 16;

int test = 0xffffffff;

/* We depend on a pure ASCII environment here. */
char digit (int d)
{
char ret = '0';

ret += d;

/* For bases greater than 10, skip from '9' to 'A' in the ASCII
* table. This may have been a trick your examiner was looking
* for. */
if (ret > '9') {
ret += 7;
}
return (ret);
}

char *itoa (int value, char *buffer)
{
int i = 0;
char tmp[SIZE+1];

tmp[SIZE] = 0;

/* For negative numbers, set the sign. Having done that, do the
* conversion as a positive number. */
if (value < 0) {
strcat (buffer, "-");
value = 0 - value;
}

while (value) {
i++;

/* Since we are starting at the right end of the string, we
* fill our temporary buffer from the right. This is likely
* another trick the examiner wanted. */
tmp[SIZE-i] = digit (value%base);
value = value/base;
}

/* I have no way of knowing the size of the incoming buffer, so I
* have to build in a buffer, then copy into the given
* one. strcat is likely more efficient and faster than strncat,
* and probably coded in assembler. */
strcat (buffer, &tmp[SIZE-i]);
}

main (int argc, char *argv[])
{
char buffer[23];

buffer[0] = 0; /* Force buffer to hold a string of
* length 0. Likely another trick the
* examiner wanted. */

printf ("%s, a program to exercise a subroutine to convert an int to a string.\n",
argv[0]);

printf ("0x%x converts to a string as %s.\n", test, itoa (test, buffer));
}
--------------------------------------------------
--
Charles Curley /"\ ASCII Ribbon Campaign
Looking for fine software \ / Respect for open standards
and/or writing? X No HTML/RTF in email
http://www.charlescurley.com / \ No M$ Word docs in email

Key fingerprint = CE5C 6645 A45A 64E4 94C0 809C FFF6 4C48 4ECD DFDB
Loading...