probs in ending a string
hello all,
tell me, isn't <CR><LF>, i.e. \r\n together, the same thing as '\0'?
I am also confused about what to use as the end-of-line character. SMTP asks for <CR><LF>, but if I use that, I will have to check every received char in a for loop and stop at \r\n. I was planning to use \0, since the length can then be retrieved with strlen(). Please advise.
with thanks
-dev
RobSeace
05-13-2003, 01:09 PM
No, '\0' is a null char... It's got nothing at all to do with CR/LF
("\r\n")... You won't typically find any null chars being transmitted
in any of those text-based protocols, like SMTP, HTTP, etc...
They generally limit themselves to displayable 7-bit ASCII chars...
Yes, if you want to send commands, you have to terminate them
with "\r\n"... (Actually, most good servers will be lenient, and
accept just "\n", but still the specs call for "\r\n"...) And, yes, if
you want to read responses, you typically want to read lines of
text terminated by "\r\n"... So, you either want something that
reads a single char at a time until it hits '\n', or you want to create
a buffering system that reads in blocks and splits them at '\n', or
you want to use some existing similar system such as stdio...
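The first option above (one char at a time until '\n') could look roughly like the sketch below. The name read_line() and its return codes are made up for illustration; it uses plain read() on a descriptor, and it does not retry on EINTR, which real code probably should.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read one line from fd into out (capacity cap),
 * one byte at a time, stopping at '\n'. A trailing "\r\n" or "\n" is
 * stripped and the result is NUL-terminated.
 * Returns the line length (0 for an empty line), -1 on EOF or error
 * before a terminator is seen, -2 if the line does not fit in out. */
ssize_t read_line(int fd, char *out, size_t cap)
{
    size_t len = 0;
    assert(out != NULL && cap > 0);

    for (;;) {
        char c;
        ssize_t n = read(fd, &c, 1);
        if (n <= 0)
            return -1;              /* EOF or error before '\n' */
        if (c == '\n')
            break;
        if (len + 1 >= cap)
            return -2;              /* line too long for the buffer */
        out[len++] = c;
    }
    if (len > 0 && out[len - 1] == '\r')
        len--;                      /* drop the CR of a CRLF pair */
    out[len] = '\0';
    return (ssize_t)len;
}
```

One system call per byte is slow, which is why the buffering options are usually preferred for anything busy.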
vikasgp
05-13-2003, 01:27 PM
Yeah, it's quite some work to look for those CRLFs, especially in protocols like POP where the message ends with CRLF.CRLF
It makes life easier if the response is first split into lines ( with CRLF as line terminator ), and perhaps tag all the lines into a list. Then you can look at the individual lines to determine when transmission ends.
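A minimal sketch of that idea, assuming the response has already been collected into one NUL-terminated buffer: walk it line by line and stop at the line that is just ".". The function name is hypothetical, and the sketch only counts lines instead of building a list.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: scan a NUL-terminated response made of
 * CRLF-terminated lines. Returns the number of lines that come before
 * the lone "." line ending a multi-line response, or -1 if that
 * terminator has not arrived yet. */
int lines_before_terminator(const char *resp)
{
    int count = 0;
    const char *p = resp;
    const char *eol;
    assert(resp != NULL);

    while ((eol = strstr(p, "\r\n")) != NULL) {
        size_t len = (size_t)(eol - p);
        if (len == 1 && p[0] == '.')
            return count;           /* "." alone on a line: done */
        count++;
        p = eol + 2;                /* skip past the CRLF */
    }
    return -1;                      /* need more data */
}
```

A line that merely starts with '.' is not treated as the terminator; only a line that is exactly "." counts.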
hello Rob,
but still, what could be the potential problems if I use '\0' for end of line? As my project need not be fully based on an RFC like that of SMTP and others, what could happen if I use '\0' to end the line?
-dev
Sorry to intervene...
If you are "creating" your own protocol based on RFCs from SMTP and you just want to make it easier to read the lines, then the '\0' might seem better than '\r\n', however, I suggest you don't use it.
Example:
If you send() a couple of lines "this is line 1" (terminated by '\0') "this is line 2", and the receiver recv()s both messages in one recv(), then you will have a buffer like this: "this is line 1\0this is line 2\0", if you use strlen() on this buffer, it will return 14, so anyway you will have to parse the string and move in it as long as you find data.
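To make that concrete, here is a small sketch of what walking such a buffer involves. It assumes you kept the byte count that recv() returned; the function name is made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: count the '\0'-terminated records in the first
 * n bytes of buf. memchr() takes an explicit length, so unlike strlen()
 * it can be pointed past the first '\0'. A trailing fragment with no
 * terminator is not counted. */
size_t count_nul_records(const char *buf, size_t n)
{
    size_t count = 0;
    const char *p = buf;
    const char *end = buf + n;
    assert(buf != NULL);

    while (p < end) {
        const char *z = memchr(p, '\0', (size_t)(end - p));
        if (z == NULL)
            break;                  /* incomplete record */
        count++;
        p = z + 1;                  /* next record starts after the '\0' */
    }
    return count;
}
```

So the '\0' terminator does not remove the parsing loop; it only changes which byte the loop looks for.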
The best approach (what Rob has already explained) would be to read into a buffer and then search for the <eol> token (you can use just '\n', which some MTAs will understand, although it is not completely compliant), doing the search just like you would with the '\0' terminator.
If you want to know how to do this, you can take a look at strtok(3)'s man page, strchr(3)'s man page (the last one preferred over the first one)
Basically, what you'd do is:
--- Pseudocode ---
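char buffer[MAX_CHARS + 1];
char *p, *e;
ssize_t n;

// Read at most MAX_CHARS so there is room for the '\0' that strchr() needs
n = recv(sockfd, buffer, MAX_CHARS, 0);
if (n <= 0)
    return;    // error, or the peer closed the connection
buffer[n] = '\0';

p = buffer;
// e = pointer to the next '\n' in the buffer
while ((e = strchr(p, '\n')) != NULL) {
    // New line starting at p, ending at e
    // Supposing we have a copy_string() function!!!
    copy_string(p, e);
    p = e + 1;
}
// Whatever is left at p is an incomplete line; keep it for the next recv()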
This code is ugly, plagued by errors, not very functional, and probably not the best way to do this. Work on it and you will have the appropriate code for achieving what you want.
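One way to elaborate on the loop above so that it copes with a line split across two recv() calls is to keep the leftover bytes in a small accumulator. This is only a sketch: struct line_buf, lb_feed(), lb_next_line() and the 1024-byte capacity are all invented for illustration.

```c
#include <assert.h>
#include <string.h>

#define LB_CAP 1024                 /* arbitrary size for this sketch */

/* Hypothetical accumulator for bytes that arrive in arbitrary chunks. */
struct line_buf {
    char   data[LB_CAP];
    size_t used;
};

/* Append n received bytes. Returns 0 on success, -1 if they do not fit. */
int lb_feed(struct line_buf *lb, const char *src, size_t n)
{
    assert(lb != NULL && src != NULL);
    if (n > LB_CAP - lb->used)
        return -1;
    memcpy(lb->data + lb->used, src, n);
    lb->used += n;
    return 0;
}

/* Pop one complete line into out (capacity cap), without its "\n" or
 * "\r\n". Returns 1 if a line was produced, 0 if no complete line is
 * buffered yet, -1 if the line does not fit in out (it stays buffered). */
int lb_next_line(struct line_buf *lb, char *out, size_t cap)
{
    char *nl;
    size_t len, consumed;
    assert(lb != NULL && out != NULL && cap > 0);

    nl = memchr(lb->data, '\n', lb->used);
    if (nl == NULL)
        return 0;
    consumed = (size_t)(nl - lb->data) + 1;
    len = consumed - 1;
    if (len > 0 && lb->data[len - 1] == '\r')
        len--;                      /* drop the CR of a CRLF pair */
    if (len >= cap)
        return -1;
    memcpy(out, lb->data, len);
    out[len] = '\0';
    /* Shift the unconsumed tail to the front of the buffer. */
    memmove(lb->data, lb->data + consumed, lb->used - consumed);
    lb->used -= consumed;
    return 1;
}
```

The intended use is to call lb_feed() with whatever each recv() returned and then call lb_next_line() in a loop until it returns 0.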
The other way to do it would be using higher level I/O functions:
For example:
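FILE *bfile;
int sockfd;    // not "socket", which would shadow the socket() function
char buffer[MAX_LENGTH+1];
// Socket connected or something
...
// Assign the file descriptor to a higher level structure for buffered I/O
bfile = fdopen(sockfd, "r");
if (bfile == NULL) {
    // fdopen() failed; handle the error
}
// Now use standard functions for reading line-by-line
// fgets() returns NULL on EOF or error
if (fgets(buffer, sizeof buffer, bfile) != NULL) {
    // buffer holds one line, including its trailing "\r\n"
}
...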
Read man pages, elaborate on the code you have seen here, or create your own.
Search on these forums for other questions related and you might find code examples.
RobSeace
05-13-2003, 07:38 PM
Yeah, I agree with Loco: I don't think using '\0' instead of "\r\n" is
really going to buy you anything, and might well make matters
even harder to deal with...
If you want a truly easy to parse protocol, design it so that
every message sent is prefixed with a header which gives the total
length of the message to read... Then, you can read that (fixed-sized)
header, and from that know exactly how much to read()/recv()
to get a single, complete message... However, if you want it
to be text-based, so people can interact directly with it via
telnet and such (like SMTP, HTTP, etc.), then you probably
don't want that approach... But, if you only care about it being
parsed by your code, then that's always the easiest way to go,
IMHO...
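For illustration, a length-prefixed scheme along those lines might be sketched as below. The 4-byte big-endian header and the function names are assumptions made for this example, not something specified in the thread, and EINTR handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly n bytes; returns 0 on success, -1 on EOF or error. */
static int read_full(int fd, void *buf, size_t n)
{
    unsigned char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Write exactly n bytes; returns 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t n)
{
    const unsigned char *p = buf;
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w <= 0)
            return -1;
        p += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Send a 4-byte big-endian length header (assumed format), then the payload. */
int send_msg(int fd, const void *payload, uint32_t len)
{
    unsigned char hdr[4];
    assert(payload != NULL || len == 0);
    hdr[0] = (unsigned char)((len >> 24) & 0xff);
    hdr[1] = (unsigned char)((len >> 16) & 0xff);
    hdr[2] = (unsigned char)((len >> 8) & 0xff);
    hdr[3] = (unsigned char)(len & 0xff);
    if (write_full(fd, hdr, sizeof hdr) != 0)
        return -1;
    return write_full(fd, payload, len);
}

/* Receive one message into out (capacity cap); stores its size in *len.
 * Returns 0 on success, -1 on EOF, error, or a message larger than cap. */
int recv_msg(int fd, void *out, uint32_t cap, uint32_t *len)
{
    unsigned char hdr[4];
    uint32_t n;
    assert(out != NULL && len != NULL);
    if (read_full(fd, hdr, sizeof hdr) != 0)
        return -1;
    n = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
        ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];
    if (n > cap)
        return -1;                  /* refuse oversized messages */
    if (read_full(fd, out, n) != 0)
        return -1;
    *len = n;
    return 0;
}
```

Because the receiver knows the size up front, the payload may contain any bytes at all, including '\0' or "\r\n", without confusing the parser.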
hello all,
thanks, I get that parsing the whole received buffer is better than searching for \0 as such. But still, if I am going to search for \r\n or \n using a library or user-defined function anyway, what difference does it make whether I use \0 or \n? Is the reason that \0 is more likely than \r\n to appear in the middle of a message or the data sent? Or is it historical? I do not have to provide compatibility with old or other systems, as mine is to be a customized application.
Also, wouldn't the problem in the example given by Loco still be there if I used \n? If I send 2 lines terminated by \n and search for \n using a library function, won't it return only the first line? And if I use a user-defined function to search for <eol>, either I stop at the first \n and return the first line, leaving the second, or I keep searching to the end, and if the remaining string contains a \n as garbage, I return an invalid string!
(Actually, as my system is purely a command/response system, with each message limited to a single line (other than data/file transfer, whose size is prefixed as Rob said, thus creating no problems), can't I be assured that only one line is received at a time? Well, if I think from a hacker's point of view, things get complicated, as they can use the compromises I made to break or hit the system!)
Prefixing the length of the message before the message is a good idea. I am using it for file transfer (prefixing a structure with the file size and other info), but for commands/responses, which are strings or text, it would not, as you said, be a good idea.
So may I conclude that the use of \r\n or \n is more historical than logical?
-dev
RobSeace
05-14-2003, 01:53 PM
Well, I suppose you could say that... It's just that the action of a
carriage-return and line-feed makes sense as a method of ending
a line of text, because it places the cursor in a location for starting
another new line (when interpreted by pretty much any sort of
sane terminal device)... So, it then also makes sense to use
CR+LF as an EOL marker, for that very reason (that they are
pretty much designed for the sole purpose of "ending a line", and
moving onto another one)... You can see it as arbitrary and
historical, I suppose; but, it certainly seems quite logical to ME...
However, the null character also makes sense as a way to end
a string... Not a LINE, per se, but just a string of text... But, sure,
you could certainly logically use it as a sort of EOL marker, in
the right context... However, it requires more careful handling
on your part than CR+LF, since so many standard lib functions
treat '\0' as the end of the string, and won't continue past it, when
perhaps you might want them to... Agreed, it can certainly be
overcome, but it IS an issue to at least be aware of, if you're going
to use it... And, also, since the whole point was to have a
text-based protocol which could be typed in by a human via
telnet (or nc, or similar), then I think you'll find CR+LF will be a
lot easier for them to try to type than a null character... (Especially
if you allow just a LF to terminate the line, without a CR...) And,
if you don't need to allow for human interaction, then as I said,
there's no reason to need any kind of EOL marker at all: just use
a message size header...
hi all,
thanks for the help
-dev