common libcurl mistakes
Documentation HTTP method CURLOPT_NOSIGNAL
Return codes Certificate checks -DCURL_STATICLIB
Verbose option Zero termination Set the URL
curl_global_init C++ strings callback invokes
Redirects Threading C++ methods
Why are these mistakes made?
Humans are lazy
Copy and pasted from questionable sources
Documentation is hard
Internet transfers are complicated
Maybe, just maybe, the curl way isn’t always the smartest...
Skipping the documentationSkipping the documentation
Lots of options have plain English names
Might trick you think you know what it does
Still might not work like you presume it does
Copy and paste from random web sites
There are also details
The devil is always in the details
Lots of documentationLots of documentation
We offer man pages for every setopt option
We host over 100 stand-alone examples
Consider which docs you rely on (hello
Failure to check return codesFailure to check return codes
Return codes areReturn codes are usefuluseful cluesclues
How to know if the call succeeded?
How to know why something doesn’t do what you expected?
What if the feature isn’t even built-in?
Our example source codes might be bad examples
Forgetting the verbose option
Strange, how come it doesn’t work?
Hm, why does it act like this?
/* please be verbose */
rc = curl_easy_setopt(hnd, CURLOPT_VERBOSE, 1L);
/* provide a buffer to store errors in */
curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, errbuf);
libcurl or content?
By using verbose, you’ll spot if this was libcurl that said it or if this
was actual content delivered from the server!
Error 505: HTTP Version Not Supported
Maybe even in production?
Consider it for debug options
Direct the output somewhere suitable with
There's a global init function
It is called implicitly by curl_easy_perform() if not done
Not calling it means relying on default, implicit behavior
It typically then implies not calling curl_global_cleanup()
This may result in not releasing all used memory (“Dear sirs,
why does valgrind report that...”)
curl_global_init isn't thread-safe
curl_global_init needs to be called as a singleton
It is not thread-safe due to legacy and “reasons”
Will hopefully be rectified in a near future
There's a global init function!
Call curl_global_init first
Call curl_global_cleanup last
Let users set (parts of) the URL
Scheme (maybe even use another protocol?)
Host name (maybe target a malicious server)
Extreme lengths (pass in 2GB of data?)
Also consider other inputs: user name, password etc risk
Set only a limited part of the URL
Disabled certificate checks
Widely abused and misunderstood
Only use while experimenting / developing
Never ship in production
This also goes for HTTPS proxies
SCP and SFTP is different
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0L);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L);
Verify server certificates!
Avoid man-in-the-middle attacks
HTTPS is not secure without it!
May require regularly updating the CA store
Assume zero terminated data in callbacks
CURLOPT_WRITEFUNCTION and CURLOPT_HEADERFUNCTION set
Libcurl provide data to the application using these callbacks
The data is provided as a pointer to the data and length of that data
When that data is primarily text oriented, many users wrongly assume
that this means the data comes as zero terminated “strings”.
size_t write_callback(char *dataptr, size_t size, size_t nmemb, void *userp);
C++ strings are not C strings
libcurl provides a C API
C and C++ are similar
C and C++ are also different!
C++ users like their std::string types
C++ Strings are not C strings
curl_easy_setopt() takes a vararg...
C++ string bad code
// Keep the URL as a C++ string object
// Pass it to curl
curl_easy_setopt(curl, CURLOPT_URL, url);
C++ string good code
// Keep the URL as a C++ string object
// Pass it to curl as a C string!
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
libcurl is thread-safe but there are caveats:
1) No concurrent use of handles
2) OpenSSL < 1.1.0 need mutex callbacks setup
3) curl_global_init is not thread-safe
Signals is a unix-concept: “an asynchronous notification sent to a
process or to a specific thread within the same process in order to notify it of
an event that occurred”
Signals are complicated in a multi-threaded world and
when used by a library
What does libcurl use signals for?
When using the synchronous name resolver, libcurl uses alarm()
to abort slow name resolves (if a timeout is set), which ultimately
sends a SIGALARM to the process and is caught by libcurl
libcurl installs its own sighandler while running, and restores the
original one again on return – for SIGALARM and SIGPIPE.
Closing TLS (with OpenSSL) can trigger a SIGPIPE if the connection
Unless CURLOPT_NOSIGNAL is set!
What does CURLOPT_NOSIGNAL do?
It stops libcurl from triggering signals
It prevents libcurl from installing its own sighandler
Generated signals must then be handled by the libcurl-
Creating and using libcurl statically is easy and convenient
Seems especially popular on Windows
Requires the CURL_STATICLIB define to be set when building your
Omission causes linker errors:
"unknown symbol __imp__curl_easy_init”
Because Windows need __declspec to be present or absent in the headers
depending on how it links!
Static builds mean chasing deps
Libcurl can use many 3rd party dependencies
When linking statically, all those need to be provided to the linker
The curl build scripts (as well as your application linking) usually
need manual help to find them all
(Sibling to the C++ strings mistake)
C++ class methods look like functions
C++ class methods cannot be used as callbacks with
… since they assume a ‘this’ pointer to the current object
Static member functions work!
A C++ method that works
// f is the pointer to your object.
static size_t YourClass::func(void *buffer, size_t sz, size_t n, void *f)
// Call non-static member function.
// This is how you pass pointer to the static function:
curl_easy_setopt(hcurl, CURLOPT_XFERINFOFUNCTION, YourClass::func);
curl_easy_setopt(hcurl, CURLOPT_XEFRINFODATA, this);
Write callback invokes
Data is delivered by callback (CURLOPT_WRITEFUNCTION)
It might be called none, one, two or many times
Never assume you will get a certain amount of calls
Independently of the data amount
Because of network, server, kernel or other reasons