Wednesday, June 23, 2021

How to search for a repeated character in a text file and replace every occurrence with a new line using sed (e.g. comma-delimited text)

Summary: 

In this post, I am going to show you how to search for a repeated character in a text file and replace each occurrence with a new line to create a list.

Problem or Goal: 

Take the following task as an example: I have a log file containing parameters separated by commas (","). See the snippet below:

3gdt,aace_support,access_restriction,adc,add,apn_conversion_wg,apn_redirection,apn_resolution_extension,attach_iot_limit,authentication_stationary_subscriber,conversational_qos_class,detach_inactive_subscriber_da,dual_access_support,dual_transfer_mode,ebm,edge_support,enhanced_uplink_mbr,gb_over_ip,gsm_adaptive_paging,gtp_user_location,gtpprime,gw_failure_restoration_gsm,gw_failure_restoration_wcdma,highest_qos_imsi,imei_check,integrated_traffic_capture,ipsec_support,lawful_interception,national_roaming_restriction,nw_init_sec_pdpctxt,payload_limit,pdp=1575,ps_ho,qos_hsdpa_mbr,rim_transfer,s_cdr_cause_code_ext,sau=2200,sau_lte=1,secondary_context,selective_service_request,sgsn_pool,sms_limit,srns_relocation,ss7_over_ip,streaming_qos_class,subscription_restriction,ue_signalling_control,ue_trace_mme

These are Ericsson SGSN-MME features. My task is to present these features in a neat list :-)

Cause:

Many log files and data exports are comma-delimited. Comma-delimited is a data format in which each piece of data is separated by a comma. It is a popular format for transferring data from one application to another, because most database systems can import and export comma-delimited data.

Solution: 

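The sed command below reads the file line by line and, thanks to the g (global) flag, replaces every comma on each line with a new line. The result is a very nice list of Ericsson SGSN-MME features. Note that treating \n in the replacement as a new line is GNU sed behaviour, which is what Linux systems normally ship.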

linux-v7yi:/home# cat SGSN01-ER-SER-AD-Features-Summary.txt | sed -E 's/,/\n/g'
3gdt
aace_support
access_restriction
adc
add
apn_conversion_wg
apn_redirection
apn_resolution_extension
attach_iot_limit
authentication_stationary_subscriber
conversational_qos_class
detach_inactive_subscriber_da
dual_access_support
dual_transfer_mode
ebm
edge_support
enhanced_uplink_mbr
gb_over_ip
gsm_adaptive_paging
gtp_user_location
gtpprime
gw_failure_restoration_gsm
gw_failure_restoration_wcdma
highest_qos_imsi
imei_check
integrated_traffic_capture
ipsec_support
lawful_interception
national_roaming_restriction
nw_init_sec_pdpctxt
payload_limit
pdp=1575
ps_ho
qos_hsdpa_mbr
rim_transfer
s_cdr_cause_code_ext
sau=2200
sau_lte=1
secondary_context
selective_service_request
sgsn_pool
sms_limit
srns_relocation
ss7_over_ip
streaming_qos_class
subscription_restriction
ue_signalling_control
ue_trace_mme
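
To see what the g flag contributes, here is a small sketch that runs the same substitution with and without it. The three-item sample line and the variable names are made up for illustration and are not taken from the log file. It assumes GNU sed, where \n in the replacement means a new line.

```shell
# Sample input: a made-up comma-delimited line (not from the log file).
line='a,b,c'

# Without g, sed replaces only the first comma on each line.
first_only=$(printf '%s\n' "$line" | sed 's/,/\n/')

# With g, sed replaces every comma on each line.
every_comma=$(printf '%s\n' "$line" | sed 's/,/\n/g')

printf 'without g:\n%s\n\n' "$first_only"
printf 'with g:\n%s\n' "$every_comma"
```

Without g, only the first comma is turned into a new line, so "b,c" stays together on the second line. With g, each item ends up on its own line.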

Problem Solved?

Yes. Here is a simpler example that is easier to follow:

linux-v7yi:/home# echo a,b,c,d,e,f,g,h,i,j,k | sed -E 's/,/\n/g'
a
b
c
d
e
f
g
h
i
j
k
linux-v7yi:/home#
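
If the sed on your system does not turn \n in the replacement into a new line (some non-GNU versions may not), tr can do the same single-character swap. This is only a sketch of an alternative, not the command used above, and the function name split_commas is made up for illustration.

```shell
# Hypothetical helper: read text on stdin and turn every comma into a new line.
# tr translates single characters, so no regular expression is involved.
split_commas() {
    tr ',' '\n'
}

# Same sample input as the echo example above.
echo a,b,c,d,e,f,g,h,i,j,k | split_commas
```

To run it on a file instead, redirect the file into the function, for example: split_commas < SGSN01-ER-SER-AD-Features-Summary.txt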