connecting libcurl and libuv

I followed an example showing how to make libcurl and libuv work together. I removed the global variables and tested the performance by downloading a 4.4GB file without writing it to disk. It works great, and the result is comparable to my experience with completion ports on Windows.

libcurl: 7.66.0
libuv: 1.33.1
url: http://saimei.ftp.acc.umu.se/debian-cd/current-live/amd64/iso-hybrid/debian-live-10.2.0-amd64-lxqt.iso
code: 200

real	2m37,210s
user	0m2,297s
sys	0m6,527s

The source code of my playground: https://github.com/amacal/peak/tree/5810adf59a026fc5f630e57791fc463f0452701c

recovering c skills

I have not used C for about 15 years, and I have never programmed in C at work. Recently I felt I wanted to satisfy my low-level programming needs. I started by preparing my vim environment, then wrote a Makefile and a simple program that uses the libcurl library.

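#include <stdio.h>
#include <curl/curl.h>

/* Discard the body, but report every byte as handled so the transfer continues. */
size_t on_write(char *ptr, size_t size, size_t nmemb, void *userdata) {
  return size * nmemb;
}

/* Print download progress; a non-zero return value would abort the transfer. */
int on_progress(void *clientp, curl_off_t dltotal, curl_off_t dlnow, curl_off_t ultotal, curl_off_t ulnow) {
  printf("total %" CURL_FORMAT_CURL_OFF_T ", now %" CURL_FORMAT_CURL_OFF_T "\n", dltotal, dlnow);
  return 0;
}

int main(int argc, char** argv) {
  CURL *easy_handle = curl_easy_init();
  if (easy_handle == NULL) {
    return 1;
  }

  /* Numeric options are passed as long; 0L enables the progress callback. */
  curl_easy_setopt(easy_handle, CURLOPT_NOPROGRESS, 0L);
  curl_easy_setopt(easy_handle, CURLOPT_URL, "https://example.com/");

  curl_easy_setopt(easy_handle, CURLOPT_WRITEFUNCTION, on_write);
  curl_easy_setopt(easy_handle, CURLOPT_XFERINFOFUNCTION, on_progress);

  CURLcode result = curl_easy_perform(easy_handle);
  if (result != CURLE_OK) {
    fprintf(stderr, "transfer failed: %s\n", curl_easy_strerror(result));
  }

  curl_easy_cleanup(easy_handle);

  return result == CURLE_OK ? 0 : 1;
}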

testing echo server

I was wondering how to test an echo server. My conclusion was to send a lot of random data and compare what was sent with what was received. For the first part I called the openssl tool included in the Git for Windows distribution to generate about 1.8GB of random text. To send all the generated data I used NetCat for Windows.

openssl rand -base64 10000000000 -out /z/random.txt
cat /z/random.txt | nc 127.0.0.1 54443 > /z/received.txt
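The commands above produce the two files but leave the comparison itself open. A minimal sketch of that step in C, assuming both files fit the "sent" and "received" roles from the commands; `files_equal` is a hypothetical helper name, not something from the original setup:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Compares two files chunk by chunk.
 * Returns 1 if they are identical, 0 if they differ,
 * and -1 if either file cannot be opened or read. */
int files_equal(const char *path_a, const char *path_b) {
  assert(path_a != NULL && path_b != NULL);

  FILE *a = fopen(path_a, "rb");
  FILE *b = fopen(path_b, "rb");
  int result = 1;

  if (a == NULL || b == NULL) {
    result = -1;
  } else {
    unsigned char buf_a[4096], buf_b[4096];
    size_t read_a, read_b;

    do {
      read_a = fread(buf_a, 1, sizeof buf_a, a);
      read_b = fread(buf_b, 1, sizeof buf_b, b);

      /* Different chunk sizes mean different file lengths. */
      if (read_a != read_b || memcmp(buf_a, buf_b, read_a) != 0) {
        result = 0;
      }
    } while (result == 1 && read_a == sizeof buf_a);

    if (ferror(a) || ferror(b)) {
      result = -1;
    }
  }

  if (a != NULL) fclose(a);
  if (b != NULL) fclose(b);
  return result;
}
```

Calling `files_equal("/z/random.txt", "/z/received.txt")` should return 1 when the echo server returned every byte unchanged. The `cmp` tool that ships with most Unix-like shells does the same job without any code.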

echo server

I am trying to write a fairly complex and scalable network application using only ANSI C and WinAPI to get as much performance as possible. I am not ready to do it in one iteration, so the first step was to create a simple echo server. The solution took about 4 hours to implement. It uses only a single thread, but it handles many connections concurrently thanks to Windows IOCP.

Source Code: GitHub
Windows IOCP: Documentation

almost json deserializer

I wanted to solve the following problem:

  • process 20 million relatively small JSON files in a projection
  • each JSON file is valid and contains up to 100 properties
  • there are about 30 different JSON formats
  • there are about 50 different projections
  • each projection uses only a few properties of each JSON format
  • each projection resides in a separate git repository

With the following constraints:

  • the processing should take about 10 minutes
  • the number of JSON formats will increase
  • the number of projections will increase
  • the projection should report used properties

And the following preferences:

  • I don’t want to share or maintain JSON formats as csharp code
  • I want to use the projection code to report used properties

I wrote the prototype and I found out that:

  • the bottleneck is JSON deserialization into dynamics
  • the Newtonsoft.Json deserializer is very slow
  • Jil is faster but still slow

I wrote my own deserializer which:

  • deserializes into dynamics
  • parses only valid JSON
  • accepts JSON documents of at most 64kB
  • expects JSON without indentation
  • requires each deserialized object to be consumed before the next one is deserialized

I compared the results and my implementation is as fast as static Jil or NetJSON. Sometimes it is even faster.

Interested? Check it out: https://github.com/amacal/jynd

endianness

Recently I was working on the torrent encryption protocol, which uses Diffie-Hellman key exchange. I used the .NET built-in System.Numerics assembly, which offers the BigInteger structure. Even the ModPow method was included. “Great, there is even a ToByteArray method”, I thought. Then I spent two days debugging because I didn’t check the byte order returned by this method. Why does Microsoft always implement things this way? As a developer I would expect the BigInteger structure to have the following signature:

public struct BigInteger : // some interfaces
{
   public BigInteger(byte[] value);
   public BigInteger(byte[] value, ByteOrder endianness);

   public byte[] ToByteArray();
   public byte[] ToByteArray(ByteOrder endianness);

   // other members
}
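For context, BigInteger.ToByteArray() returns the value in little-endian order, least significant byte first. If the peer expects big-endian integers (an assumption here, as is common for network protocols), the bytes have to be reversed before they go on the wire. A minimal sketch of that reversal in C; `reverse_bytes` is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Reverses a buffer in place, turning a little-endian byte sequence
 * into a big-endian one (or the other way round). */
void reverse_bytes(unsigned char *buf, size_t len) {
  assert(buf != NULL || len == 0);

  if (len < 2) {
    return;
  }

  for (size_t i = 0, j = len - 1; i < j; i++, j--) {
    unsigned char tmp = buf[i];
    buf[i] = buf[j];
    buf[j] = tmp;
  }
}
```

Reversing the bytes 04 03 02 01 yields 01 02 03 04. Note that ToByteArray() may also append a 0x00 sign byte to positive values, which ends up as a leading zero after the reversal and may need trimming for a fixed-width field.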

anonymous types and dynamics

Anonymous types are compiled as internal. What is the impact? You can still inspect them using reflection, but you cannot access their properties from other assemblies using the dynamic keyword. Trying to do so throws a RuntimeBinderException.

count(*) vs count(name)

What do you expect from the following query?

select count(*) as total,
       count(id) as by_id, 
       count(first_name) as by_first_name,
       count(last_name) as by_last_name,
       count(birthdate) as by_birthdate
from people

If the people table has the following data:

| id | first_name | last_name | birthdate |
+----+------------+-----------+-----------+
| 1  | John       | Doe       |           |
| 2  |            | Anonymous |           |

The answer is:

 | total | by_id | by_first_name | by_last_name | by_birthdate |
 +-------+-------+---------------+--------------+--------------+
 | 2     | 2     | 1             | 2            | 0            |

NULLs are not counted: count(*) counts every row, while count(column) counts only the rows where that column is not NULL.
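The same counting rule can be modeled outside SQL. A small sketch in C, assuming the blank cells in the table are NULLs and representing each column as an array of strings where a NULL pointer stands for SQL NULL; `count_star` and `count_column` are hypothetical names for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Models count(*): every row counts, whatever its columns hold. */
size_t count_star(size_t row_count) {
  return row_count;
}

/* Models count(column): only rows whose value is not NULL count. */
size_t count_column(const char *const *values, size_t row_count) {
  assert(values != NULL || row_count == 0);

  size_t total = 0;
  for (size_t i = 0; i < row_count; i++) {
    if (values[i] != NULL) {
      total++;
    }
  }
  return total;
}
```

With the first_name column given as { "John", NULL }, `count_column` returns 1 while `count_star(2)` returns 2, matching the by_first_name and total columns of the result.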