In this file:
    [overview]
    [optimizations]
    [todo]

=======================================================================
OVERVIEW

The IDL compiler as a whole is composed of these modules:

    - libIDL
      Reads in a .idl file, filters it through the C preprocessor, and
      parses it into a tree of data structures that represent the .idl
      file.

    - driver

    - C backend
      Takes the IDL_tree (and associated information) from the driver
      and outputs a header file, client-side stubs, server-side
      skeletons, and miscellaneous routines for a specific interface.

        - header
          The typedefs and function prototypes for modules/interfaces

        - stubs
          client-side marshal/send request/demarshal routines

        - skeletons
          server-side demarshal/upcall/marshal routines

        - common
          routines such as the type allocation/de-allocation routines
          that are needed by both stubs & skeletons.

      All four routine categories use the same basic idea of
      recursively processing the IDL_tree and acting on "interesting"
      IDL elements as they are found by outputting the appropriate C.

-- Elliot

=======================================================================
OPTIMIZATIONS

. When marshalling or demarshalling a parameter, if you know the
  alignment of the previous parameter, you can know whether you need
  to align the current parameter.
  [implemented, AFAIK] -ECL

. When doing byteswapping, we should do it in place instead of using
  the function pointer.
  [NYI, but is very easy to change - just rewrite GET_ATOM()] -ECL

. When demarshalling structures with only fixed-size elements in them,
  we should be able to memcpy directly off the wire.

  In the normal case, for a

      struct { int int1; char *string1; int int2; };

  we have to pull off an int, then pull off a string, then pull off an
  int.

  Now consider

      struct { int int1; float float1; };

  If we have to byteswap this, no gain. But in the "don't need to
  byteswap" case, we can directly memcpy() this struct from the raw
  data buffer for a nice gain. -ECL
  [Implemented]

. There needs to be a way to say "add an iovec to the list" so that
  things like constant strings can be marshalled super-easily.
  [implemented]

  [Note that since ORBit's IIOP module uses writev() to send out data,
   and that list of vectors is generated to point at the data to be
   sent, ORBit will probably perform extremely well if you send large
   arrays or sequences of basic types, or strings, across the network.

   If you're just calling a

       void dosomething(long n1, long n2, long n3);

   all day, it will probably be less than optimal. If, on certain
   architectures, we can count on n1, n2, and n3 to be consecutive in
   memory, it might be possible for the IDL compiler to recognize this
   and output code that does

       marshal_value_at_address(&n1 /* base address */,
                                sizeof(long) * 3 /* length in bytes */);

   (the giop_message_buffer_append*() routines already try to
   recognize appends of consecutive memory regions and coalesce them,
   but it's untested, and slower than the above)
  ]

If an optimization will slow the IDL compiler down by an order of
magnitude, that's fine - the idea is to do lots of work at compile time
in order to save work at runtime.

. For the server-side, we can use gperf to generate a nice hash of the
  operation names that we know at compile time, for doing the
  operation name -> class_specific_POA_data conversion.
  [This would give a nice gain - any takers?]
  [implemented, not using gperf but a switch statement.]

. Use alloca() in skels to get memory.
  [Not needed - we just have straight variables on the stack, which is
   even faster ;-]
  [We should use alloca() instead of typename__alloc() whenever
   possible]

. For the last generated demarshaller that will ever use a
  recv_buffer, we don't need to increment the recv_buffer->cur pointer
  afterwards.
  [A pain to implement.]

. Direct mem append of string lengths instead of indirect, in certain
  cases.

. Given a known number of fixed-length values, marshal them into an
  on-stack buffer and then pass that in one call to append_mem.

. Can read & write from a socket at the same time (multithreading).
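
The "memcpy directly off the wire" item above can be illustrated with
a small sketch. This is not the code the IDL compiler emits; the names
Fixed, swap32() and demarshal_fixed() are made up for the example, and
it assumes the in-memory layout of the struct matches the wire layout
(two 4-byte fields, no padding).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct with only fixed-size members. */
typedef struct {
    int32_t int1;
    float   float1;
} Fixed;

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Demarshal a Fixed from buf. need_swap is nonzero when the sender's
 * byte order differs from ours. */
static void demarshal_fixed(Fixed *out, const unsigned char *buf,
                            int need_swap)
{
    uint32_t tmp;

    /* Assumption: memory layout == wire layout (no padding). */
    assert(sizeof(Fixed) == 8);

    if (!need_swap) {
        /* Fast path: one copy for the whole struct. */
        memcpy(out, buf, sizeof *out);
        return;
    }

    /* Slow path: pull each member off and swap it individually. */
    memcpy(&tmp, buf, 4);
    tmp = swap32(tmp);
    memcpy(&out->int1, &tmp, 4);

    memcpy(&tmp, buf + 4, 4);
    tmp = swap32(tmp);
    memcpy(&out->float1, &tmp, 4);
}
```

A struct containing a char * member cannot take the fast path, since
the string's bytes live elsewhere and have a variable length.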
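
The "add an iovec to the list" item, together with the note about
coalescing consecutive memory regions, might look roughly like the
sketch below. VecList, veclist_add() and veclist_send() are
hypothetical names, not ORBit's giop_message_buffer_append*() API;
only struct iovec and writev() are real POSIX interfaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_VECS 8

/* Hypothetical list of memory regions waiting to be sent. */
typedef struct {
    struct iovec vecs[MAX_VECS];
    int nvecs;
} VecList;

/* Add a region to the list without copying it. If it starts exactly
 * where the previous region ends, extend that entry instead. */
static int veclist_add(VecList *l, const void *base, size_t len)
{
    if (l->nvecs > 0) {
        struct iovec *last = &l->vecs[l->nvecs - 1];
        if ((const char *)last->iov_base + last->iov_len ==
            (const char *)base) {
            last->iov_len += len;
            return 0;
        }
    }
    if (l->nvecs == MAX_VECS)
        return -1;
    l->vecs[l->nvecs].iov_base = (void *)base;
    l->vecs[l->nvecs].iov_len = len;
    l->nvecs++;
    return 0;
}

/* Send every region with one system call. A real sender would also
 * have to handle short writes. */
static ssize_t veclist_send(int fd, const VecList *l)
{
    return writev(fd, l->vecs, l->nvecs);
}
```

A constant string can then be marshalled by adding its length and its
bytes as two entries; the string data itself is never copied. The
regions must stay valid until veclist_send() has returned.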
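
For the operation name lookup that is done with a switch statement
instead of gperf, the general shape could be something like the
following. The operation names, the OP_* constants and
lookup_operation() are invented for illustration; the real generated
skeleton code may be organized differently.

```c
#include <string.h>

/* Hypothetical operation indices for an interface with three
 * operations: balance, deposit, withdraw. */
enum {
    OP_UNKNOWN = -1,
    OP_BALANCE,
    OP_DEPOSIT,
    OP_WITHDRAW
};

/* Map an operation name to its index. The switch on the first
 * character narrows the candidates, so at most one strcmp() is
 * needed per lookup. */
static int lookup_operation(const char *opname)
{
    switch (opname[0]) {
    case 'b':
        if (strcmp(opname + 1, "alance") == 0)
            return OP_BALANCE;
        break;
    case 'd':
        if (strcmp(opname + 1, "eposit") == 0)
            return OP_DEPOSIT;
        break;
    case 'w':
        if (strcmp(opname + 1, "ithdraw") == 0)
            return OP_WITHDRAW;
        break;
    default:
        break;
    }
    return OP_UNKNOWN;
}
```

Since all operation names are known when the IDL is compiled, the
compiler can pick whichever character positions separate them best
and nest further switches where names share a prefix.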
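
The idea of marshalling a known number of fixed-length values into an
on-stack buffer and appending them in one call can be sketched as
below. SendBuffer and this append_mem() are stand-ins written for the
example (this append_mem() copies its input), and
marshal_dosomething() is a guess at what a stub for the
dosomething(long n1, long n2, long n3) case might do, with IDL long
taken as 32 bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical send buffer standing in for the real message buffer. */
typedef struct {
    unsigned char data[256];
    size_t used;
    unsigned calls;     /* append_mem() call count, for illustration */
} SendBuffer;

/* Copy len bytes onto the end of the buffer. */
static int append_mem(SendBuffer *b, const void *src, size_t len)
{
    if (len > sizeof b->data - b->used)
        return -1;
    memcpy(b->data + b->used, src, len);
    b->used += len;
    b->calls++;
    return 0;
}

/* Gather the three arguments into one contiguous on-stack block and
 * append them with a single call instead of three. */
static int marshal_dosomething(SendBuffer *b,
                               int32_t n1, int32_t n2, int32_t n3)
{
    int32_t tmp[3];

    tmp[0] = n1;
    tmp[1] = n2;
    tmp[2] = n3;
    return append_mem(b, tmp, sizeof tmp);
}
```

If the append routine only records a pointer (as with an iovec list)
rather than copying, the on-stack block has to stay alive until the
message has actually been sent.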