I'm trying to parse an HTTP request and have been doing so using `strtok()`, but I'm running into problems when trying to use `strcpy()`.
I can parse the file path and file name fine, but I can't seem to parse the remote host DNS name. Below is my code, which should tokenize the string, get the DNS name, and store it in a `char[]` called `host`.
    #include <stdio.h>
    #include <time.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <stdlib.h>

    int main()
    {
        int c = 0, c2 = 0;
        char *tk, *tk2, *tk3, *tk4;
        char buf[64], buf2[64], buf3[64], buf4[64];
        char host[1024], path[64], file[64];

        strcpy(buf, "GET /~yourloginid/index.htm HTTP/1.1\r\nHost: remote.cba.csuohio.edu\r\n\r\n");

        tk = strtok(buf, "\r\n");
        while(tk != NULL)
        {
            if(c == 1)
            {
                tk2 = strtok(tk, " ");
                while(tk2 != NULL)
                {
                    if(c2 == 1)
                    {
                        printf("%s\n", tk2);
                        strcpy(host, tk2);
                        // printf("%s\n", host);
                    }
                    ++c2;
                    tk2 = strtok(NULL, " ");
                }
            }
            ++c;
            tk = strtok(NULL, "\r\n");
        }
        return 0;
    }
Bear with me, I'm a new C programmer and my code may be ugly. Every time I try running the program, I get a `Segmentation fault (core dumped)` error, and I believe it has to do with `strcpy()`. I can print out the tokenized string fine, but I can't seem to copy it into a `char[]`.
Sorry, but the strtok(3) function will not parse HTTP at all. Despite this, I'll try to explain what's happening in your code.
- The first time you enter the loop, `tk == "GET /~yourloginid/index.htm HTTP/1.1"`, and the buffer has been changed to `"GET /~yourloginid/index.htm HTTP/1.1\0\nHost: ..."`. As `c == 0`, you won't enter the `if` block, so you'll just get the `c` variable incremented, and `tk = strtok(NULL, "\r\n");` is called again for the second line.
- The second time you enter the loop, `tk == "Host: remote.cba.csuohio.edu"`, as strtok(3) jumped over the first `\0` in the string, skipped the delimiter characters (`\r` and `\n`) it found there, and took everything up to the next delimiter. strtok has put a second `\0` after that part, leaving the buffer as `"Host: remote.cba.csuohio.edu\0\n..."`. As `c == 1` this time, you get inside the `if` block and call `strtok(tk, " ");`. This makes strtok(3) forget the extent of the string it was parsing and begin a new parse on `"Host: remote.cba.csuohio.edu"` (as you passed a non-null first argument). It returns `tk2 == "Host:"`, putting a `\0` after `"Host:"`. The second time you enter the inner loop, you copy the value into the `host` variable.
- The third time the main loop comes around, you have `tk == NULL`. The last time you called `tk2 = strtok(NULL, " ");` (in the inner loop) it returned `NULL`, and strtok will continue returning `NULL` until you initialize it again by passing a non-null first argument.
strtok(3) operates on the string passed as its first parameter and modifies it (it writes `\0` characters into it). Further, it has a hidden global variable that marks where the parse left off, which is how it is able to return `NULL` when it has finished parsing. If you nest calls to strtok(3) you will not get the behaviour you expect, as you lose the internal state of the function when you initialize it again by passing a non-null first parameter. This is what goes wrong in your nested loops.
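The lost state can be observed directly with a small sketch. The function name `lines_seen_with_nested_strtok` is made up for this illustration; it counts how many lines the outer loop gets to visit when an inner `strtok()` loop runs on each one.

```c
#include <stdio.h>
#include <string.h>

/* Counts how many lines the OUTER loop visits when an inner strtok()
 * loop splits each line into words. `text` is modified in place. */
int lines_seen_with_nested_strtok(char *text)
{
    int lines = 0;
    char *line = strtok(text, "\r\n");

    while (line != NULL) {
        ++lines;

        /* Restarting strtok() on `line` overwrites its hidden state. */
        char *word = strtok(line, " ");
        while (word != NULL)
            word = strtok(NULL, " ");

        /* This continues the exhausted inner parse, not the outer one,
         * so it returns NULL even if more lines follow in `text`. */
        line = strtok(NULL, "\r\n");
    }
    return lines;
}
```

Called on a writable buffer holding three lines, such as `"a b\r\nc d\r\ne f"`, it returns 1: only the first line is ever seen by the outer loop.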
Calling strtok(3) has numerous drawbacks, and it cannot be used in several nested loops because it internally stores state related to the parse. Its use is discouraged. If you want something nestable, you have to switch to strtok_r(3) instead. That function has an extra parameter that lets you keep the strtok internal state externally, so you can have several parses in progress at the same time.
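As a rough sketch of that switch, the nested loops from the question could look like the following with strtok_r(3). The function name `second_line_value` and its return convention are assumptions made for this example, and it keeps the question's (fragile) idea of taking the second word of the second line.

```c
#define _POSIX_C_SOURCE 200809L  /* strtok_r() is POSIX, not ISO C */
#include <stdio.h>
#include <string.h>

/* Copies the second space-separated word of the second line of `request`
 * into `host`. Returns 0 on success, -1 if it is missing or does not fit.
 * `request` is modified in place. */
int second_line_value(char *request, char *host, size_t host_size)
{
    char *line_state;   /* saved position of the outer (line) parse */
    char *word_state;   /* saved position of the inner (word) parse */
    int line_no = 0;

    for (char *line = strtok_r(request, "\r\n", &line_state);
         line != NULL;
         line = strtok_r(NULL, "\r\n", &line_state), ++line_no) {
        if (line_no != 1)
            continue;

        int word_no = 0;
        for (char *word = strtok_r(line, " ", &word_state);
             word != NULL;
             word = strtok_r(NULL, " ", &word_state), ++word_no) {
            if (word_no == 1) {
                if (strlen(word) >= host_size)
                    return -1;          /* would overflow `host` */
                strcpy(host, word);
                return 0;
            }
        }
    }
    return -1;
}
```

When calling it, declare the buffer as `char request[] = "GET ...";` so that it is sized from the literal. The request in the question is 70 characters long, which is more than `char buf[64]` can hold, so the `strcpy()` into `buf` writes past the end of the array before any parsing starts.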
Further, strtok will parse `"GET___/~yourlogin..."` just as happily as `"GET_/~yourlogin..."` (I have used underscores to represent spaces, to show multiple spaces between the method name and the URI), and the former is not permitted in HTTP. For the same reason, you can get `"Host:remote.cba.csuohio.edu"`, with no space after the colon, as a valid header field (however, its use is discouraged), and you will not parse it correctly that way. Also, the `Host:` header field might not be the first line in the HTTP header, so you can skip it if you are not careful.
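A minimal sketch of a lookup that copes with those two cases (no space after the colon, and `Host:` not being the first header line) without strtok could look like this. The function name `find_host` is hypothetical, and the code is only an illustration under simplifying assumptions: lines end in CRLF, and no other parts of the HTTP grammar are validated.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strncasecmp(), POSIX */

/* Searches the header lines of `request` for a Host field and copies its
 * value into `host`. Returns 0 on success, -1 if there is no usable Host
 * field or the value does not fit. `request` is not modified. */
int find_host(const char *request, char *host, size_t host_size)
{
    /* Header fields start after the CRLF that ends the request line. */
    const char *line = strstr(request, "\r\n");

    while (line != NULL) {
        line += 2;                              /* step over the CRLF */
        if (line[0] == '\r' || line[0] == '\0')
            break;                              /* blank line: end of headers */

        /* Field names are case-insensitive. */
        if (strncasecmp(line, "Host:", 5) == 0) {
            const char *value = line + 5;
            value += strspn(value, " \t");      /* optional leading blanks */
            size_t len = strcspn(value, "\r\n");
            while (len > 0 && (value[len - 1] == ' ' || value[len - 1] == '\t'))
                --len;                          /* optional trailing blanks */
            if (len == 0 || len >= host_size)
                return -1;
            memcpy(host, value, len);
            host[len] = '\0';
            return 0;
        }
        line = strstr(line, "\r\n");            /* move to the next line */
    }
    return -1;
}
```

Because it never writes into `request`, it can be called on a string literal or on a receive buffer that other code still needs intact.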
If you want to parse HTTP, as a first reading I can recommend RFC 2616, "Hypertext Transfer Protocol -- HTTP/1.1", which is the mandatory document for implementors to comply with. Beware, it's a dense and large document.