I can confirm that I have built for many embedded systems and can nearly always achieve builds in the 80-120K range with the same basic setup: RSA enabled, ECC enabled, AES or 3DES, SHA256 and maybe SHA512/384, plus TLS functionality. This setup will typically give me a few PFS cipher suite options. If I need additional features, the library can grow a few K here or there.

What I often find very interesting is the difference between compilers. For example, on one project I built the exact same settings and source files with two different compilers. The first, an ARM THUMB compiler, output a binary test application that was 80K. With the second toolchain/compiler (granted, main.c was a bit different because a different device was being targeted), I saw a 240K binary executable. This wasn't because of anything in the library; we just didn't have the compiler tuned to optimize. Without changing anything in the wolfSSL library or wolfSSL settings, I was able to tune the compiler rules a bit and quickly get the 240K build down to 160K. That was still double the size of what the other compiler produced, but it goes to show just how much the toolchain itself can affect the resulting binary size.
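For anyone wanting a starting point, the basic feature set listed at the top of this post could be expressed roughly like this in a user_settings.h file. This is only a sketch, not the exact configuration from my builds: it assumes the build defines WOLFSSL_USER_SETTINGS so the header is picked up, the trimming macros at the end are my own suggestions, and the defaults for individual algorithms can vary between wolfSSL versions.

```c
/* user_settings.h: illustrative sketch only, not a tested configuration.
 * Assumes the build is compiled with WOLFSSL_USER_SETTINGS defined. */
#ifndef USER_SETTINGS_H
#define USER_SETTINGS_H

/* Public key: RSA stays enabled as long as NO_RSA is not defined.
 * ECC has to be switched on explicitly; it is what makes the
 * ECDHE (PFS) cipher suites available. */
#define HAVE_ECC

/* Symmetric ciphers: AES stays enabled as long as NO_AES is not defined.
 * This sketch picks AES and drops 3DES; remove NO_DES3 to keep 3DES. */
#define NO_DES3

/* Hashes: SHA-256 stays enabled as long as NO_SHA256 is not defined.
 * SHA-512 and SHA-384 are the optional extras mentioned above. */
#define WOLFSSL_SHA512
#define WOLFSSL_SHA384

/* Assumed trimming (not from the original post): leave out older
 * protocol versions and algorithms that are rarely needed. */
#define NO_OLD_TLS
#define NO_RC4
#define NO_MD4
#define NO_DSA
#define NO_PSK

#endif /* USER_SETTINGS_H */
```

Even with identical settings like these, the final size still depends heavily on the compiler and its optimization options, as described above.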
Can you tell us what compiler you are working with and whether it is set up for optimization? Can you also tell us whether you are looking at the size of the library itself or the size of the executable binary? (An executable binary is quite different from a library object, since the library object does not get optimized until it is linked into an executable binary.)
Like Eric mentioned, if you would like to discuss this in more detail, feel free to reach out to firstname.lastname@example.org.