\chapter{Experimental Investigation}

This chapter describes the results obtained from each of the Packers developed as the practical component of this thesis. For each Packer we present an overview of the experiment, its hypothesis, the expected results, the actual results obtained and a discussion of what the results mean.

\section{Standard CORBA implementation}
\label{Standard}

\subsection{Overview}
The experiment relating to the standard CORBA implementation is designed to demonstrate the overhead that CORBA adds to data transmission when using the standard CORBA marshalling algorithm. The results obtained from this experiment are compared to the results obtained from the octet marshalling algorithm and the low level communication mechanism in order to test the hypotheses put forward for this experiment.

\subsection{Hypothesis}
\label{StandardHypothesis}
The first hypothesis of this experiment is that using the standard CORBA implementation, and hence preserving the `OO' nature of programming, yields a lower throughput than a low level communication mechanism such as sockets. The second hypothesis is that the standard CORBA marshalling algorithm yields a lower throughput than the octet marshalling algorithm.

\subsection{Expected Results}
The results obtained for this experiment are expected to show that the standard way of marshalling data in CORBA is less efficient than passing the same amount of data through a low level mechanism such as sockets. Additionally, the results are expected to show that by using `octets' it is possible to provide a data marshalling/packing algorithm which is comparable to, if not better than, the default CORBA marshalling mechanism.

\subsection{Actual Results}
From the data presented in tables \ref{OneBit}, \ref{LowBit}, \ref{MediumBit} and \ref{BigBit} it is possible to see that the standard marshalling algorithm used to pack and transmit the data was consistently slower than the socket implementation by a significant margin. From these results it is possible to work out the overhead that CORBA adds to the default marshalling by subtracting the time it takes to transmit the data over TCP/IP from the time it takes CORBA to transfer the data using its own mechanism. Based on the user times in table \ref{OneBit} (57.0 seconds for standard marshalling less 4.2 seconds for TCP/IP) the overhead is 52.8 seconds over 10,000 iterations, which means that roughly 92\% of the CORBA user time is overhead. Surprisingly, the standard CORBA implementation in Orbix was more efficient than the octet implementation. A number of ideas are canvassed in section \ref{CORBADiscussion} as to why the standard CORBA marshalling implementation consistently beat the octet marshalling technique.

\subsection{Discussion}
\label{CORBADiscussion}
The results presented in tables \ref{OneBit}, \ref{LowBit}, \ref{MediumBit} and \ref{BigBit} validate one of the hypotheses proposed for this experiment and invalidate the other. The first hypothesis, regarding the speed comparison to the sockets implementation, was validated. This can be explained by the extra information which needs to be added by a standard CORBA implementation, which results in the 92\% slow down. As has been mentioned in other sections (refer to section \ref{DescribeExperiment}) the CORBA infrastructure incurs a considerable amount of overhead in order to provide the developer with a higher level of abstraction. This overhead comes in the form of bit conversion, protocol conversion and the actual marshalling of the data elements before transmission. The results generated by this experiment also invalidate the second hypothesis regarding the speed of octet marshalling.
As was mentioned earlier, the standard CORBA marshalling algorithm beat the octet marshalling algorithm on every occasion. This finding is quite surprising considering that the standard CORBA marshalling algorithm adds so much overhead to the transmission before it actually sends the data. From the results we can infer that IONA's CORBA implementation (Orbix) does not purely marshall the data and then transmit it. The results suggest that Orbix's data marshalling algorithm is not written for genericity but is aware of specific data types and as a result treats them in different ways when marshalling. An optimisation that seems to be used in Orbix is a polymorphic copy command which decides which method is best for copying based on the data structure and then copies it. This was confirmed during the development of the Packers when a character array which was used to pass binary data around was not completely copied. The reason for the partial copy of the buffer was traced back to the binary bit stream having a binary zero within it. It appears that Orbix detects the character array and selects a copying method similar to \texttt{strcpy}, which stops on the detection of a binary zero in the stream. Because a binary zero is a valid member of a bit stream, a specialised member function to copy the buffer had to be written so that the buffer could be copied in a distributed environment.

In addition to Orbix having a `polymorphic' copy function it is also suspected that the ORB performs other operations such as data compression to reduce the transmission size. Another possibility is that the ORB realises during its protocol negotiation phase that the objects reside on the same machine and hence the same architecture.
This means that the ORB may turn off or eliminate all of the marshalling and encoding mechanisms that it employs when dealing with remote systems, leading to yet a smaller amount of data to transfer.

If the ORB is doing all of this extra work in the background, and thereby reducing the size of the marshalled data, it is understandable that the standard marshalling algorithm continually outperforms the octet marshalling. The main driving factor behind this is that the octet marshalling does not engage in data compression or in any activity other than marshalling the data and transmitting it. This means that the values returned from the octet marshalling are a true example of how much overhead actually goes into a standard CORBA marshall.

\begin{table}[ht!] % The ! mark was used to force the table in at this position.
{\scriptsize
\begin{center}
\begin{tabular}{|l|l|l|}
\hline
Packing Method & User Time (secs) & System Time (secs)\\
\hline \hline
TCP/IP & 4.2 & 12.2 \\ \hline
Structure Marshalling & 53.7 & 31.1 \\ \hline
Standard Marshalling & 57.0 & 33.0 \\ \hline
Octet Marshalling & 86.4 & 51.3 \\ \hline
\end{tabular}
\end{center}
\caption{\label{OneBit} Time taken to transfer 8 bits of data}
}
\end{table}

\begin{table}[ht!] % The ! mark was used to force the table in at this position.
{\scriptsize
\begin{center}
\begin{tabular}{|l|l|l|}
\hline
Packing Method & User Time (secs) & System Time (secs)\\
\hline \hline
TCP/IP & 8.1 & 16.4 \\ \hline
Standard Marshalling & 67.1 & 41.0 \\ \hline
Structure Marshalling & 75.6 & 41.3 \\ \hline
Octet Marshalling & 131.3 & 75.8 \\ \hline
\end{tabular}
\end{center}
\caption{\label{LowBit} Time taken to transfer 160 bits of data}
}
\end{table}

\begin{table}[ht!] % The ! mark was used to force the table in at this position.
{\scriptsize
\begin{center}
\begin{tabular}{|l|l|l|}
\hline
Packing Method & User Time (secs) & System Time (secs)\\
\hline \hline
TCP/IP & 8.5 & 15.7 \\ \hline
Standard Marshalling & 130.6 & 74.24 \\ \hline
Structure Marshalling & 131.4 & 70.7 \\ \hline
Octet Marshalling & 154.6 & 84.7 \\ \hline
\end{tabular}
\end{center}
\caption{\label{MediumBit} Time taken to transfer 168 bits of data}
}
\end{table}

\begin{table}[ht!] % The ! mark was used to force the table in at this position.
{\scriptsize
\begin{center}
\begin{tabular}{|l|l|l|}
\hline
Packing Method & User Time (secs) & System Time (secs)\\
\hline \hline
TCP/IP & 18.2 & 34.3 \\ \hline
Standard Marshalling & 226.9 & 132.4 \\ \hline
Octet Marshalling & 275.2 & 162.3 \\ \hline
Structure Marshalling & 277.5 & 162.4 \\ \hline
\end{tabular}
\end{center}
\caption{\label{BigBit} Time taken to transfer 360 bits of data}
}
\end{table}

\begin{table}[ht!] % The ! mark was used to force the table in at this position.
{\scriptsize
\begin{center}
\begin{tabular}{|l|l|l|}
\hline
Packing Method & User Time (secs) & System Time (secs)\\
\hline \hline
TCP/IP & 24.9 & 45.3 \\ \hline
Standard Marshalling & 322.1 & 218.6 \\ \hline
Octet Marshalling & 302.2 & 223.5 \\ \hline
Structure Marshalling & 337.9 & 226.1 \\ \hline
\end{tabular}
\end{center}
\caption{\label{BigBigBit} Time taken to transfer 904 bits of data}
}
\end{table}

\begin{center}
\begin{figure}[ht]
\begin{minipage}[t]{15.0cm}
\begin{center}
\epsfxsize=500pt
\centerline{\epsfbox{results.eps}}
\end{center}
\end{minipage}
\caption{\label{GraphResult} Graphical Representation of the Data Marshalling Performance}
\end{figure}
\end{center}

Additionally, the results in table \ref{BigBit} show that the standard data marshalling algorithm takes considerably longer when a more complex type such as a CORBA string is introduced.
This is evident from the times required to marshall a 168 bit structure as opposed to a 160 bit structure, and from the times required to marshall the 168 bit data structure and the 904 bit data structure (refer to tables \ref{LowBit}, \ref{MediumBit}, \ref{BigBit} and \ref{BigBigBit}). To calculate the performance slowdown when there is a change in the complexity of the data we can take the user time required to marshall the 160 bit and 168 bit structures and calculate the change in time. After calculating the difference in transmission periods it is possible to see that there is a 17.71\% overhead incurred when the data changes slightly in complexity.

In figure \ref{GraphResult} it is possible to see two peaks in the graph. The first peak can be attributed to the introduction of a more complex data structure (i.e. the addition of a character pointer in the data structure) and CORBA generating a new data structure big enough to hold the data. This memory allocation was conducted by CORBA and not by the operating system, as all of the CORBA implementations (the standard marshalling method, the structure marshalling method and the octet marshalling method) showed a decrease in their throughput. A similar peak was experienced at 904 bits, but this decrease in throughput affected not only the CORBA methods of data transfer but also the TCP/IP method. This would seem to suggest that the slow down was due to memory being allocated on a system wide basis rather than being specific to one particular marshalling method or framework. From these results it can be expected that as we marshall more data we will experience more periods in which the efficiency of data marshalling decreases.

\section{Octet Data marshalling/packaging mechanism}
\label{OctResults}

\subsection{Overview}
Octet data marshalling is meant to be a method which developers can use to minimise the amount of overhead that CORBA adds to marshalling.
This experiment uses the octet marshalling implementation to demonstrate how little overhead CORBA adds in this mode when it is preparing data for transmission. The results obtained from this experiment are compared to the results obtained from the standard CORBA implementation and the low level communication mechanism in order to test the hypotheses put forward for this experiment and for the other experiments.

\subsection{Hypothesis}
\label{OctetHypothesis}
The first hypothesis for this experiment is that a marshalling algorithm implemented using octets achieves a better throughput than the standard CORBA marshalling algorithm which was detailed in section \ref{Standard}. This means that we are expecting fast times from the octet marshalling algorithm. The second hypothesis is that even with the octet marshalling mechanism it is still impossible to match the throughput that can be achieved by a low level communication mechanism such as sockets.

\subsection{Expected Results}
From this experiment we expect to find results which show that the octet form of marshalling substantially reduces the amount of overhead which is added to the data transmission stream. Additionally, the results are expected to show that the octet marshalling algorithm still has significant overheads attached to it when compared with a low level communication mechanism such as sockets.

\subsection{Actual Results}
The results in tables \ref{OneBit}, \ref{LowBit}, \ref{MediumBit}, \ref{BigBit} and \ref{BigBigBit} show that the first hypothesis put forward for this experiment is largely invalidated. The only exception is the time it takes to marshall a structure through CORBA.
In this instance it was actually faster to use the octet marshalling routine than the CORBA implementation (see tables \ref{BigBit} and \ref{BigBigBit}). According to table \ref{BigBit} there is a 2.3 second difference in user time over 10,000 iterations (275.2 seconds for octet marshalling against 277.5 seconds for structure marshalling). This difference demonstrates that Orbix takes extra time to marshall a data structure which contains embedded structures. The improvement can also be seen in table \ref{BigBigBit}, where there is a difference of roughly 20 seconds in user time over 10,000 iterations between the octet marshalling method (302.2 seconds) and the standard marshalling algorithm (322.1 seconds). Overall, however, these results imply that marshalling the data through the octet data stream takes longer than standard marshalling. Reasons why the octet data marshalling algorithm performed so badly can be found in sections \ref{CORBADiscussion} and \ref{OctetDiscussion}.

Additionally, the results from this experiment validate the second hypothesis regarding the throughput which can be achieved using a low level communication mechanism. When the times for octet marshalling and socket communications are compared, the octet version takes between four and five and a half times as long in system time, and more than twelve times as long in user time. An explanation for this result can be found in section \ref{OctetDiscussion}.

\subsection{Discussion}
\label{OctetDiscussion}
As was mentioned previously in section \ref{CORBADiscussion}, the octet marshalling algorithm may look worse than it actually is when compared to the standard marshalling algorithm because of the clever tricks that the ORB might be performing in the background. These tricks, which are used to improve the performance of the standard marshalling algorithm, may include data compression and specialised member functions which are optimised for handling CORBA OMG IDL data types.
One reason why the standard marshalling mechanism is so much more efficient than the octet marshalling scheme is that the octet marshalling scheme does not use any of the potential tricks that have been discussed to decrease the size of the marshalled data. An additional reason why the octet stream might be slower is the extra processing that needs to be performed in order to get the data stream ready. As the `octet' type disables all interventions from the ORB, it becomes the developer's responsibility to build all of the functions which marshall the data, and hence the opportunity for inefficient code arises. This extra preparation could lead to times which are slower than those of the optimised code the ORB would be using to marshall the data. This could explain the dramatic difference in some cases between the time that it takes to marshall an octet stream and the time to marshall a standard CORBA stream (refer to table \ref{LowBit}).

With regard to comparing the throughput of the octet marshalling algorithm to the sockets communication mechanism, it is possible to see why there is such a great difference between the two. As was mentioned earlier, the sockets implementation has very little overhead associated with it. By using sockets we avoid the extra abstraction which both the standard CORBA and octet implementations force upon us, such as connection management, bit conversion, byte ordering and protocol conversion. The only overhead that the socket implementation has to deal with is the network header information which is written to the front of every packet that is transmitted and the associated processing time involved in prefixing the header information.
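
The binary zero problem described in section \ref{CORBADiscussion} is typical of the details a developer inherits when marshalling an octet stream by hand. The following is a minimal sketch of that failure mode in plain C++, assuming nothing about how Orbix is implemented internally. The function names \texttt{copy\_as\_string} and \texttt{copy\_as\_octets} are hypothetical and are not part of Orbix or of the Packers; they only contrast a \texttt{strcpy} style copy with a length based copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helpers for illustration only; not Orbix or Packer code.

// String-style copy: std::strcpy stops at the first zero byte, so any
// payload bytes after an embedded binary zero are silently dropped.
// Returns the number of payload bytes that were copied.
std::size_t copy_as_string(char* dst, const char* src) {
    assert(dst != nullptr && src != nullptr);
    std::strcpy(dst, src);
    return std::strlen(dst);
}

// Octet-style copy: the caller states the length explicitly, so zero
// bytes inside the buffer are treated as ordinary data.
// Returns the number of bytes that were copied.
std::size_t copy_as_octets(char* dst, const char* src, std::size_t len) {
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, len);
    return len;
}
```

For a six byte buffer holding the values 1, 2, 0, 3, 4, 0, the string style copy reports only two payload bytes, whereas the length based copy preserves all six. The price of the second form is that the length has to be carried alongside the buffer, which is part of the extra work the octet Packer has to do itself.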

\section{Low Level communication mechanism (TCP/IP Socket)}
\label{SocketResults}

\subsection{Overview}
The aim of this experiment is to show that by using a low level communication mechanism it is possible to bypass the overhead which the CORBA infrastructure adds during data transmission. The results obtained from this experiment are compared to the values obtained from the experiments detailed in sections \ref{Standard} and \ref{OctResults} in order to determine what overhead is added when using those data marshalling techniques.

\subsection{Hypothesis}
\label{TCPIPHypothesis}
The hypothesis put forward for this experiment is that by using a low level communication mechanism it is possible to eliminate a significant amount of the overhead which is added to the transmission by the other two data marshalling algorithms outlined in sections \ref{Standard} and \ref{OctResults}.

\subsection{Expected Results}
The results from this experiment are expected to show that marshalling a complex heterogeneous data structure through a socket takes significantly less time than marshalling it through either of the other two algorithms (standard, octet) which were tested in the previous experiments.

\subsection{Actual Results}
The figures shown in table \ref{BigBit} provide adequate support for the hypothesis put forward for this experiment. From the table it is possible to see that it took only 34.3 seconds of system time to marshall and transmit the 353 bit structure 10,000 times, as compared to 132.4 seconds for standard marshalling and 162.3 seconds for octet marshalling. This trend of the socket implementation performing better than the other implementations can also be seen in tables \ref{OneBit}, \ref{LowBit} and \ref{MediumBit}. An explanation of these results can be found in section \ref{SockDiscussion}.
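
As a rough check on these figures, the gap to the socket baseline can be written out explicitly. The symbols $t_{\mathrm{sock}}$, $t_{\mathrm{std}}$ and $t_{\mathrm{oct}}$ are introduced here only for illustration and denote the system times in table \ref{BigBit} for TCP/IP, standard marshalling and octet marshalling respectively.

\begin{eqnarray*}
\frac{t_{\mathrm{std}}}{t_{\mathrm{sock}}} & = & \frac{132.4}{34.3} \approx 3.86 \\
\frac{t_{\mathrm{oct}}}{t_{\mathrm{sock}}} & = & \frac{162.3}{34.3} \approx 4.73 \\
t_{\mathrm{std}} - t_{\mathrm{sock}} & = & 132.4 - 34.3 = 98.1 \mbox{ secs} \\
t_{\mathrm{oct}} - t_{\mathrm{sock}} & = & 162.3 - 34.3 = 128.0 \mbox{ secs}
\end{eqnarray*}

Spread over the 10,000 iterations, this corresponds to roughly 9.8 and 12.8 milliseconds of additional system time per transfer for standard and octet marshalling respectively.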

\subsection{Discussion}
\label{SockDiscussion}
From the results just mentioned it is clear that the socket mechanism is the fastest way to transmit the data. This is because the socket mechanism has very little overhead associated with it other than the general network header information which is automatically prefixed to every data packet.

Unfortunately, this speed comes at a price. The socket mechanism may be a fast way of transmitting data, but it cannot deal with data in an object oriented manner. Sockets are designed only to deal with data transmission and hence are not capable of dealing with behaviour or encapsulation while transmitting the data. This makes the sockets implementation very similar to the RPC style of programming, which suffers from the same problem. The problem stems from the fact that a low level communication mechanism does not give the developer the necessary abstraction from the problem domain. With low level mechanisms the developer is responsible for providing the routines which establish the connection, maintain the connection, marshall the data, transmit the data, read the data and manage the differences between machine architectures. Having to provide this level of code forces the developer to work at a very low level and to write non-portable code, which makes the application of little use in a dynamic distributed environment, as was discussed in \citeN{SCHMIDTMAY95}.

The next chapter provides a summary of all the results which have been generated from the Packers, and from these results we form a number of conclusions. We also answer the research questions which were proposed in the introduction.