comp / comp.lang.tcl / Re: unicode text

Subject: unicode text
From: saito
Newsgroups: comp.lang.tcl
Organization: A noiseless patient Spider
Date: Sat, 9 Nov 2024 02:28 UTC

Is there a way to remove emojis, non-printable and other graphic
characters from a string? I can use a regexp with a-zA-Z and such but
this doesn't account for valid characters from non-ascii/non-Western
languages, right?

Subject: Re: unicode text
From: Michael Soyka
Newsgroups: comp.lang.tcl
Organization: self
Date: Sat, 9 Nov 2024 03:15 UTC
References: 1

On 11/08/2024 9:28 PM, saito wrote:
> Is there a way to remove emojis, non-printable and other graphic
> characters from a string? I can use a regexp with a-zA-Z and such but
> this doesn't account for valid characters from non-ascii/non-Western
> languages, right?
>
I've found that this regular expression works for emojis:
[^[:print:][:cntrl:]]
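
A minimal sketch of how that class might be applied with regsub. The proc name stripUnwanted is made up for illustration, and what counts as [:print:] can depend on the Tcl version and build, so it is worth checking against your own data:

```tcl
# Hypothetical helper (not from the thread): delete every character
# matched by the bracket expression quoted above.
proc stripUnwanted {text} {
    # -all replaces every match, not just the first.
    # Braces stop Tcl from treating the brackets as command substitution.
    return [regsub -all {[^[:print:][:cntrl:]]} $text {}]
}

# Usage: pass in any string and print the filtered result.
puts [stripUnwanted "plain ASCII text"]
```

Because [:cntrl:] sits inside the negated class, control characters such as tabs and newlines are kept. Using [^[:print:]] alone would presumably remove them as well.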

Subject: Re: unicode text
From: saito
Newsgroups: comp.lang.tcl
Organization: A noiseless patient Spider
Date: Sat, 9 Nov 2024 17:57 UTC
References: 1 2

On 11/8/2024 10:15 PM, Michael Soyka wrote:
> On 11/08/2024 9:28 PM, saito wrote:
>> Is there a way to remove emojis, non-printable and other graphic
>> characters from a string? I can use a regexp with a-zA-Z and such but
>> this doesn't account for valid characters from non-ascii/non-Western
>> languages, right?
>>
> I've found that this regular expression works for emojis:
>    [^[:print:][:cntrl:]]

Thanks! That is a good start.
