Approach to adding knowledge constraints to a data-driven generative model for Carnatic rhythm sequence

By: Ganguli, Kaustuv Kanti .

Contributor(s): Guedes, Carlos .

Publisher: Noida STM Journals 2019

Edition: Vol. 9(3), Sep-Dec.

Description: 11-17 p.

Subject(s): Electrical Engineering

In: Trends in electrical engineering (TEE)

Summary: Computational models for generative music are a recent trend in AI-based technology development. However, an entirely data-driven strategy often fails to capture naturally occurring rhythmic grouping. Guedes et al. [1,2] proposed dictionary-based and stroke-grouping-based approaches to generate novel sequences in the 8-beat cycle of Aditala. More recently, arithmetic partitioning as conceived by performers was incorporated [3] to address a drawback of the earlier model: it captured only local, short-term phrasing and failed to capture the long-term structure and grammar of this idiom. One way of solving this issue is to consider a rhythmic phrase as a gestalt, i.e. to hypothesize three rationales: (i) a sequence of strokes, when played at a faster speed, behaves as an independent unit and not as a mere compressed version of the reference; (ii) context influences the accent: the same phrase is played differently as part of a composition than as a filler (ornamentation) during improvisation; (iii) phrases show a co-articulation effect: the gesture differs in anticipation of the forthcoming stroke or pattern. Initial experiments show that a reference phrase time-compressed to 4x speed sounds perceptibly different from the same phrase actually played at 4x speed by the same musician. This indicates a gestural difference in articulating the same phrase at different speeds. We extract timbral features to understand the differences, though there is a context dependence that seems to be captured in a supra-segmental way, motivating us to investigate prosodic features. This suggests that a syntactically correct sequence may not be semantically plausible with respect to a musician's expectancy. As the qualitative evaluation of CAMel [1] involves expert listening, we believe that adding the proposed knowledge constraints would improve the naturalness, and hence the acceptability, of the generated sequences.

Item type	Current location	Call number	Status	Date due	Barcode	Item holds
Articles Abstract Database	School of Engineering & Technology Archival Section		Not for loan		2020-2021216

