Here’s my PythonScript script, which can output a CSV or TSV file while ensuring that each row has the right number of columns and any instances of the column separator inside a column are handled correctly.
'''
====== SOURCE ======
Requires PythonScript (https://github.com/bruderstein/PythonScript/releases)
Based on this question: https://community.notepad-plus-plus.org/topic/25962/separate-question-and-answer-about-1000-topics
====== DESCRIPTION ======
Converts a list of questions in the following format
into an RFC-4180 compliant CSV file
(in other words, a CSV file that is designed to be easy for lots of applications to read)
====== EXAMPLE ======
Assume that you have the text below (between the ------------ lines):
------------
1.1.1 This is question number 1?
"This is answer" 1 with option 1
This, is answer 1, with option 2
1.1.2 This is question number 2?
This is answer 2, with "option 1"
This is answer 2 with option 2
This is answer 2 with option 3
1.1.3 This is question "number 3"?
This is answer 3 with option 1
1.1.4 This is question, number 4?
This is answer 4 with option 1
This is answer 4, with option 2
This is answer 4 with "option" 3
This is answer 4 with option 4
------------
This script will output the following CSV file (between the ------------ lines)
------------
question,option 1,option 2,option 3,option 4
1.1.1 This is question number 1?,"""This is answer"" 1 with option 1","This, is answer 1, with option 2",,
1.1.2 This is question number 2?,"This is answer 2, with ""option 1""",This is answer 2 with option 2,This is answer 2 with option 3,
"1.1.3 This is question ""number 3""?",This is answer 3 with option 1,,,
"1.1.4 This is question, number 4?",This is answer 4 with option 1,"This is answer 4, with option 2","This is answer 4 with ""option"" 3",This is answer 4 with option 4
------------
'''
from Npp import editor, notepad
import json

def convert_q_list_to_csv_main():
    # this is set to ',' to make a CSV file.
    # you could instead use '\t' if you wanted a TSV (tab-separated values) file
    SEP = ','
    question_lines = []

    def to_RFC_4180(s: str, sep: str) -> str:
        # quote a field only if it contains a quote, the separator, or a newline;
        # literal quotes inside a quoted field are doubled
        if '"' in s or sep in s or '\r' in s or '\n' in s:
            return '"' + s.replace('"', '""') + '"'
        return s

    # each match is a question line (N.N.N text ending in ?) plus every
    # following line up to the next question line; store it as a list of lines
    editor.research(r"((?'question'^\d+\.\d+\.\d+ +(.*\?)$))(?:\R(?!(?&question)).*)+",
        lambda m: question_lines.append(m.group(0).splitlines()))
    # show what was parsed in the PythonScript console
    print(json.dumps(question_lines, indent=4))
    if not question_lines:
        # max() below would raise ValueError on an empty list
        print('No questions found; nothing to convert.')
        return
    # each entry is the question plus its options, so this is 1 + the largest option count
    max_n_options = max(len(x) for x in question_lines)
    header_text = 'question' + SEP + SEP.join('option %d' % ii for ii in range(1, max_n_options))
    out_line_texts = [header_text]
    for question in question_lines:
        RFC_4180_texts = []
        for ii in range(max_n_options):
            if ii >= len(question):
                # pad short rows so every row has the same number of columns
                RFC_4180_texts.append('')
            else:
                RFC_4180_texts.append(to_RFC_4180(question[ii], SEP))
        out_line_texts.append(SEP.join(RFC_4180_texts))
    # write the result into a new tab, using CRLF line endings as RFC 4180 specifies
    notepad.new()
    editor.setText('\r\n'.join(out_line_texts))

if __name__ == '__main__':
    convert_q_list_to_csv_main()
    del convert_q_list_to_csv_main

Before you ask, I made the odd programmatic choice to define helper functions and global constants inside the main function to avoid polluting the global PythonScript namespace.
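If you want to sanity-check the quoting and padding logic outside Notepad++, here is a minimal standalone sketch in plain Python. It reuses the same quoting rule as the script, but the names `to_rfc_4180` and `rows_to_csv` and the sample rows are my own for illustration, and it swaps the `editor.research` step for a hard-coded list. It round-trips the output through the standard library's `csv` reader to confirm that the fields come back unchanged.

```python
import csv
import io


def to_rfc_4180(field: str, sep: str = ',') -> str:
    """Quote a field only when it contains a quote, the separator, or a newline."""
    if '"' in field or sep in field or '\r' in field or '\n' in field:
        return '"' + field.replace('"', '""') + '"'
    return field


def rows_to_csv(rows, sep=','):
    """Pad every row to the width of the longest row, then join with CRLF."""
    width = max(len(row) for row in rows)
    lines = []
    for row in rows:
        padded = list(row) + [''] * (width - len(row))
        lines.append(sep.join(to_rfc_4180(f, sep) for f in padded))
    return '\r\n'.join(lines)


# hypothetical sample data, shaped like the parsed question lists
rows = [
    ['1.1.3 This is question "number 3"?', 'This is answer 3 with option 1'],
    ['1.1.4 This is question, number 4?',
     'This is answer 4 with option 1',
     'This is answer 4, with option 2'],
]

text = rows_to_csv(rows)
print(text)

# round trip: the stdlib reader should recover the original fields,
# with the short row padded by one empty column
parsed = list(csv.reader(io.StringIO(text, newline='')))
```

Passing `sep='\t'` to both functions (and `delimiter='\t'` to `csv.reader`) should give the TSV variant, since the quoting rule only depends on which separator you hand it.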