What values are allowed in syslog tag or app names?
The short answer: under the current syslog protocol (RFC 5424), any printable US-ASCII character other than a space (codes 33 to 126), with a maximum length of 48 characters.
To trace this through the relevant syslog protocol RFCs, read on.
The BSD Syslog Protocol
According to RFC 3164, the TAG must be an alphanumeric string of at most 32 characters.
4.1.3 MSG Part of a syslog Packet
The MSG part will fill the remainder of the syslog packet. This will usually contain some additional information of the process that generated the message, and then the text of the message. There is no ending delimiter to this part. The MSG part of the syslog packet MUST contain visible (printing) characters. The code set traditionally and most often used has also been seven-bit ASCII in an eight-bit field like that used in the PRI and HEADER parts. In this code set, the only allowable characters are the ABNF VCHAR values (%d33-126) and spaces (SP value %d32). However, no indication of the code set used within the MSG is required, nor is it expected. Other code sets MAY be used as long as the characters used in the MSG are exclusively visible characters and spaces similar to those described above. The selection of a code set used in the MSG part SHOULD be made with thoughts of the intended receiver. A message containing characters in a code set that cannot be viewed or understood by a recipient will yield no information of value to an operator or administrator looking at it.
The MSG part has two fields known as the TAG field and the CONTENT field. The value in the TAG field will be the name of the program or process that generated the message. The CONTENT contains the details of the message. This has traditionally been a freeform message that gives some detailed information of the event. The TAG is a string of ABNF alphanumeric characters that MUST NOT exceed 32 characters. Any non-alphanumeric character will terminate the TAG field and will be assumed to be the starting character of the CONTENT field. Most commonly, the first character of the CONTENT field that signifies the conclusion of the TAG field has been seen to be the left square bracket character ("["), a colon character (":"), or a space character. This is explained in more detail in Section 5.3.
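To make the TAG rule concrete, here is a minimal Python sketch of how a receiver might split an RFC 3164 MSG into TAG and CONTENT. The function name `split_msg` is hypothetical, and cutting an over-long alphanumeric run at 32 characters is an assumption of this sketch: the RFC only says the TAG MUST NOT exceed that length, not what a receiver should do when it does.

```python
import string

# ABNF ALPHA / DIGIT: ASCII letters and digits only (str.isalnum would also accept Unicode).
ALNUM = set(string.ascii_letters + string.digits)
MAX_TAG_LEN = 32  # RFC 3164, section 4.1.3


def split_msg(msg):
    """Split an RFC 3164 MSG into (TAG, CONTENT).

    The TAG ends at the first non-alphanumeric character, which becomes
    the first character of CONTENT. Assumption: a run longer than 32
    characters is cut at 32, with the rest treated as CONTENT.
    """
    end = 0
    while end < len(msg) and end < MAX_TAG_LEN and msg[end] in ALNUM:
        end += 1
    return msg[:end], msg[end:]
```

For example, `split_msg("sshd[1234]: hello")` returns `("sshd", "[1234]: hello")`. Note that under this strict reading a hyphen also ends the TAG, so `split_msg("my-app: x")` returns `("my", "-app: x")`.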
The Syslog Protocol
The BSD Syslog Protocol is obsolete: RFC 3164 has been superseded by RFC 5424, the Syslog Protocol, which specifies:
The MSG part of the message is described as TAG and CONTENT in RFC 3164. In this document, MSG is what was called CONTENT in RFC 3164. The TAG is now part of the header, but not as a single field. The TAG has been split into APP-NAME, PROCID, and MSGID. This does not totally resemble the usage of TAG, but provides the same functionality for most of the cases.
(Tools like logger will likely put the --tag value into the APP-NAME field in their implementation.)
APP-NAME is described in section 6.2.5:
The APP-NAME field SHOULD identify the device or application that originated the message. It is a string without further semantics. It is intended for filtering messages on a relay or collector.
The NILVALUE MAY be used when the syslog application has no idea of its APP-NAME or cannot provide that information. It may be that a device is unable to provide that information either because of a local policy decision, or because the information is not available, or not applicable, on the device.
This field MAY be operator-assigned.
And the relevant ABNF for this field is as follows:
APP-NAME = NILVALUE / 1*48PRINTUSASCII
PRINTUSASCII = %d33-126
The US-ASCII range 33-126 covers every printable ASCII character except the space (code 32). Control characters (0-31 and 127) are excluded as well.
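The ABNF translates almost directly into a check. The following is a minimal Python sketch, not taken from any syslog library; the function name `is_valid_app_name` is hypothetical, and NILVALUE is the single hyphen character as defined in RFC 5424.

```python
NILVALUE = "-"          # RFC 5424: NILVALUE = "-"
MAX_APP_NAME_LEN = 48   # APP-NAME = NILVALUE / 1*48PRINTUSASCII


def is_valid_app_name(name):
    """Return True if name matches the RFC 5424 APP-NAME rule."""
    if name == NILVALUE:
        return True
    if not 1 <= len(name) <= MAX_APP_NAME_LEN:
        return False
    # PRINTUSASCII = %d33-126: printable ASCII, space excluded.
    return all(33 <= ord(ch) <= 126 for ch in name)
```

With this check, `is_valid_app_name("my-app_1.0")` is true, while `"my app"` (contains a space), the empty string, and any name longer than 48 characters are rejected. Non-ASCII names such as `"café"` also fail, since `é` falls outside the 33-126 range.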